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Abstract 

This paper considers a cross-layer optimization problem driven by multi-timescale stochastic ex- 
ogenous processes in wireless communication networks. Due to the hierarchical information structure 
in a wireless network, a mixed timescale stochastic iterative algorithm is proposed to track the time- 
varying optimal solution of the cross-layer optimization problem, where the variables are partitioned 
into short-term controls updated in a faster timescale, and long-term controls updated in a slower 
timescale. We focus on establishing a convergence analysis framework for such multi-timescale 
algorithms, which is difficult due to the timescale separation of the algorithm and the time-varying 
nature of the exogenous processes. To cope with this challenge, we model the algorithm dynamics 
using stochastic differential equations (SDEs) and show that the study of the algorithm convergence 
is equivalent to the study of the stochastic stabiUty of a virtual stochastic dynamic system (VSDS). 
Leveraging the techniques of Lyapunov stability, we derive a sufficient condition for the algorithm 
stability and a tracking error bound in terms of the parameters of the multi-timescale exogenous 
processes. Based on these results, an adaptive compensation algorithm is proposed to enhance the 
tracking performance. Finally, we illustrate the framework by an application example in wireless 
heterogeneous network. 
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I. Introduction 

Cross-layer resource optimization plays a critical role in the radio resource management of modern 
wireless systems. In existing literature, cross-layer optimization can be divided into two categories. 
When the system states are slowly varying, it is desirable to have dynamic controls adaptive to the 
instantaneous realizations of system state. For example, in |[1|, l|2|, the authors considered dynamic 
power control (adaptive to the instantaneous channel fading state) in wireless ad hoc networks. In 
|[3|, Q, adaptive joint beam-forming is considered to mitigate interference in a cellular network. In 
practice, it is quite difficult to obtain real-time observations of the global system states or real-time 
signaling message passing in a large scale network because of the signaling latenc}{^ As a result, it 
is desirable to adapt the control actions to the system state statistics instead of real-time realizations. 
For example, in |5] the author developed a limited feedback technique that utilizes the channel 
distribution information (CDI) for communication in multiuser MIMO beamforming networks. In 
|[6|, ||7|, the problem of robust transmit beamforming in multi-user communication system using the 
covariance-based channel information is considered. 

In practice, system states usually evolve in mixed timescales in wireless networks. For example, 
in MIMO fading channels, the channel matrix changes in a short timescale (such as 10 ms) but the 
correlation and the path loss change in a longer timescale (such as minutes) ||8|, |[9|. Another example 
of mixed timescale state evolution is between the queue length process (slower timescale) and the 



instantaneous link quality (faster timescale) |10|, |11|. When the system has a multi-timescale state 



evolution, it is necessary to partition the controls in different timescales based on the information 
structure induced by the system topology. As an illustration, consider a wireless heterogeneous 
network with a macro base station (BS) and some relay BSs (RSs) as depicted in Fig. [1] The users 
transmit data flows to the macro BS with the assistance of the RSs. The control strategies depend 
on the channel state which evolves in mixed timescales. Suppose we want to adopt a cross-layer 
control to the flow data rate r for each user and the transmission power p on each link so as to 
maximize the average throughput in the example wireless network with time-varying channels. As 
the control policy may involve network-wise coordinations, one good strategy is to partition the 
control variables into local power control p and global flow control r, where the power control p 



'For example, the X2 interface in e-Node B of LTE systems lias latency of 10ms or more. 
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Figure 1. The topology of a wireless relay network with a radio resource management (RRM) server. The BSs and the 
RSs have local real-time CSI, while the RRM server has the long-term global CSI. 



adapts to the instantaneous channel state information (CSI) locally and the flow rate r adapts to 
the CSI statistics with a global coordination. This is because, while it is realistic for each wireless 
node to acquire real-time local CSI, it is extremely difficult for the network controller to acquire 
real-time global CSI. If one considers pure fast timescale control for both the power and the flow 
data rate (such as in |[2|, |12|), the policy obtained will require real-time global CSI. This is difficult 
to achieve in practice and the system performance will be very sensitive to the signaling latency in 
the acquisition of global CSI. On the other hand, if one considers pure slow adaptation for both p 
and r (statistical adaptation), the resulting policy will fail to exploit the instantaneous transmission 
opportunity observed at each wireless node, and such an approach may be too conservative. Therefore, 
it is of great importance to have a complete cross-layer control framework with timescale separations 
that embraces the exogenous mixed-timescale state evolutions and exploits the information structure 
of the network topology. 

There are quite a few works that studied controls with different timescale state evolutions [1], 
|[l3)-| 17 1. However, these works handled the different timescales separately in a heuristic manneij^] 
There are few works that considered a holistic cross layer optimization framework exploiting mixed 
timescale algorithms, not to mention the study of convergence properties of such mixed-timescale 
algorithms. 

In this paper, we focus on the study of general cross-layer optimization for mixed timescales state 
processes. We first setup a stochastic optimization formulation to optimize an average network utility. 
The control variables are partitioned into a short-term control (adapts to fast timescale state processes) 



^For example, the fast control and slow control are not optimizing the same optimization objective. 
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and a long-term control (adapts to slow timescale state processes) according to the information 
structure induced by specific network topology. These control variables are driven by mixed-timescale 
iterative algorithms to optimize the average network utility (objective function). An important question 
to the mixed timescale iterative algorithms is whether they will converge to the optimal solution. 



While the convergence of some iterative algorithms such as stochastic gradient |18|, |19| are quite 
well-studied, these existing techniques considered one timescale iteration only and the convergence 
behavior of mixed timescale algorithms is highly non-trivial due to the mutual coupling between 
the short-term and long-term control variables. Specifically, there are several first order technical 
challenges that need to be addressed. 

• Coupling in the dynamics of the long-term and short-term iterations: In most of the existing 
works involving multi-timescale control variables, the iterative algorithms driving different types 
of control variables are independent from each other. In other words, the intermediate iterates 
of the long-term variables in the outer-loop will not affect the convergence of the short-term 
variables in the inner loop. However, this decoupling is only justified when the short-term 
variables can converge to the optimal point arbitrarily fast or when we have closed-form solutions 
for the short-term variables. In general, we do not have closed-form solutions and each iteration 
in communication networks may also involve explicit signaling message passing. Hence, it may 
not be realistic to ignore the iteration dynamics in the short-term variables. Due to these couplings 
of the iterations between the long-term and short term variables, classical convergence proof in 



stochastic gradient |18|, |19| or stochastic programming |20|-|22| cannot be applied. 
• Irreducible bias to the stochastic estimator: In standard single timescale stochastic optimization 



with long-term control variables only, stochastic gradient |18|, |19| is commonly used because 
the true gradient of the problem contains an expectation operator which does not have closed 
form expression in most cases. Using an unbiased stochastic gradient estimator |[T8|, p9| , we do 
not need to compute the true gradient in every iteration. If the original problem is strictly convex. 



the stochastic gradient algorithm will converge to the optimal solution |18|, |19|. However, in 
the mixed timescale iterations, the convergence errors of the short-term control variables in the 
inner loop will induce an irreducible bias to the stochastic gradient estimator of the long-term 
control variables in the outer-loop. As such, standard convergence proof in stochastic gradient 



1 18 1, 1 19 1 cannot be applied. 

Exogenous Stochastic Variations of the State Processes: In addition to the coupUng in the 
mixed timescale iterations as well as the irreducible bias due to the short-term control iterations, 
the system states of the wireless system are also evolving stochastically. For example, the long- 
term state process (such as the channel path loss, the MIMO channel correlations or even the 
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network topology) may be time-varying due to the mobility of the users in the network or 
shadowing process. As such, these exogenous variations will also have a profound impact on 
the convergence behavior of the mixed timescale iterations. 

A. Related words 



In 1 23 1, 1 24 1, we have studied the convergence behavior of iterative algorithms in wireless systems 
under time-varying channel states. However, these works have focused on one-timescale iterations 
and the approach cannot be applied to address the above challenges in mixed-timescale iterations. 



There are also limited works on multi-timescale stochastic optimizations. In |20|-|22|, multi-stage 
stochastic programming has been applied for logistic and financial planning problems. However, the 
problem considered has very a special form (Unear program (LP)) and the solutions cannot be applied 
to our problems, which has a more general non-linear form. There are also some application examples 
Q, p3|-p7} that decompose the stochastic optimization problems into two timescale hierarchical 
solutions. However, all these works did not consider the tracking issues associated with exogenous 



variations of the problem parameters. In |25|, |26|, the authors studied the stochastic tracking al 



gorithms in a regime- switching environment which is driven by a finite state two-timescale Markov 
chain. A dynamic step-size selection algorithm for the tracking in a regime-switching environment 
has been proposed in [27|. Nevertheless, a general understanding of the convergence behavior of 
mixed-timescale algorithms is still needed. 

B. Our contribution 

In this paper, we propose an analysis framework to study the convergence behavior of mixed 
timescale iterative algorithms of cross-layer stochastic optimization for wireless networks. Specifically, 
we first introduce the cross-layer stochastic optimization framework and the variable partitioning ac- 
cording to the information structure available. We then consider a mixed timescale iterative algorithm 
and study the dynamics and coupling using continuous time stochastic differential equation (SDE). We 
show that the study of the convergence behavior of the mixed timescale algorithm is equivalent to the 
study of stochastic stability of a virtual dynamic system specified by a system of SDE. Furthermore, 
the optimal solution of the stochastic optimization problem is equivalent to an equilibrium point of the 
virtual dynamic system. As a result, the convergence behavior of the algorithm is similar to the control 



problem of tracking a moving target |23|, |24|. Based on this insight, we derive an upper bound on 
the tracking error of the mixed timescale algorithms under exogenous variations of the short-term and 
long-term state processes. Based on the insights of the impact of exogenous variations, we propose 
a low complexity compensation algorithm which can substantially enhance the tracking behavior of 
the mixed timescale algorithms. Finally, we apply this framework to an example topology in wireless 
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heterogeneous networks with relays. Numerical simulations verified the theoretical insights obtained 
and also demonstrated significant performance gain of the proposed compensation algorithms. 

The paper is organized as follows. Section |ll] illustrates the system model, where the cross-layer 
stochastic optimization framework and the mixed timescale stochastic approximation algorithm will be 



introduced. Section III studies the tracking behavior of the mixed timescale algorithm using the notion 
of virtual stochastic dynamic system (VSDS) and the techniques of Lyapunov stochastic stability. In 
Section [V| a novel compensation algorithm is proposed to enhance the tracking performance. An 
application example on flow control and power allocation in wireless relay network is illustrated in 



Section VI Section VII gives the numerical results, and we conclude the work in Section VIII 

Notations: For a sealer-valued function F : i— )• M, Fx denotes the vector of its partial derivatives 
with respect to vector x = (xi, . . . i.e., the i-th component of F^ is fJ*^ = For a vector- 
valued function G : W ^ M™, denotes the Jacobian matrix of G = {G^^\ . . . which 
is the partial derivatives of G{.) with respect to vector x, and whose {i,j)-th element is given by 
. The notation [xj denotes the largest integer that is no greater than x. For column vectors 
X = [xi ... x„]^ and y = [yi . . . ymf, {x,y) = [xi . . . x„ yi . . . y^]^ denotes a column vector 
that stacks the vectors x and y. 

II. System Model 

In this section, we first introduce the cross-layer stochastic optimization framework with mixed 
timescale exogenous random processes. We then partition the control variables into short-term and 
long-term controls and describe the mixed timescale iterative algorithms to track the optimal solutions 
of the stochastic optimization problem. Based on that, we elaborate the convergence issues induced 
by the exogenous random processes and formally define the tracking error between the algorithm 
outputs and the optimal solution. 

A. A Cross-Layer Stochastic Optimization Formulation 

1 ) Network and Mobility Model: We first discuss the system model in a wireless communication 
network with node mobilities. Consider a wireless network with Ng static nodes and Nm mobile 
nodes. The location of the static nodes are fixed, and the mobile nodes are randomly distributed at 
time t = 0. For time t > 0, the mobile nodes change their locations according to a widely adopted 



Levy walk mobility model |28|-|30| described as follows. 

Assumption 1 (Levy walk mobility model): Starting from time t = 0+, each mobile node chooses a 
random destination in a restricted region centered at the initial location and moves at a constant speed 
in (0,Wmax]- Upon reaching the destination, the node pauses for some time and randomly chooses a 
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new destination and speed to go on. The travel distance and pause time at each step follow a truncated 
Levy distribution ||29|, |[30|. ■ 

Nodes are inter-connected through wireless links, and since the node mobility is restricted, we 
assume that the network topology is fixed despite the mobility of the mobile nodes. The network 
topology can be represented by a directed graph Q = {J\f,C), where J\f is the set of nodes and C is 
the set of wireless links that connect the transmitters and the receivers. Fig. [T] illustrates an example 
wireless network, where M is the set of BS, RSs and mobile users, and C is the set of wireless 
transmission links between them. 

Define the CSI for the j-th link as hj G %. We adopt a fading model to the CSI hj as hj = h^'jhj, 
where /i^ = cqDJ'' G 'H'- is the long-term CSI for the large-scale fading with cq being an antenna-gain- 



related constant, and hj G Ti^ is the short-term CSI for the small-scale fading |j9|, |31 1. Dj > Dmin 
is the distance between the j-th link, and l is the path loss exponent. 

2) The CSI Dynamics: We specify the dynamics of the CSI hj and /i^ by the exogenous random 
processes defined below. 

Definition 1 (Exogenous stochastic processes): The short-term and long-term CSI processes hj{t) 
and are driven by the following stochastic differential equations (SDE): 

dh'j = -^anh'dt + ^dWt, /i|(0) = /ig, Vj = 1, . . . , iV, (i) 

dh^j = -coLDj{t)-'-^Vj{t)dt, ^(0) = Vj = 1,...,N, (2) 

where an > 0, vj is the relative speed of the j-th link, and Wt is a standard Brownian motion. ■ 
The positive parameter an specifies the time-correlation of the short-term exogenous processes 
{hj{t)}, which are specified by the Ornstein-Uhlenbeck processes. hj{t) has a Gaussian stationary dis- 
tribution and \hj\ follows a Rayleigh distribution ijs]], ||9|. On the other hand, defining e(Dmin, ^^max) — 
2cotDjjj-n ^Wmax, we have | < e(Z)min, Vmax)d*- The parameter e, which is typically very small, 
represents the timescale of the long-term processes 

3 ) Control Variables and the Problem Formulation: We assume the following information structure 
for the wireless communication network. 

Assumption 2 (Signaling information structure): Each node k & M has knowledge of the local 
CSI (short-term and long-term) (hj, h}-) for all the links j ^ C that connect to node k. On the other 
hand, only the global long-term CSI U = {hi, ... , /i^^) is known at the central controller of the 
network. ■ 

For example, in Fig.[T] node 5 only has the local CSI knowledge of /13, and /15. Meanwhile, the 
RRM server has the long-term global CSI /i^ for all the nodes j. Note that the above assumption is 
quite reasonable, because in practical wireless communication networks, it is relatively easy for each 
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node to acquire local real-time CSI but it would be difficult for the network controller to acquire the 
global real-time CSI. 

According to the information structure assumption, the following defines the mixed timescale 
controls in the cross-layer optimization framework. 

Definition 2 (Mixed Timescale Controls): The system has two sets of control variables, namely the 
short-term control and the long-term control. The short term control is denoted by a policy Q.^ which 
maps the realization of CSI vector h = (hi, ... ^h^) to an action 6^ S mI^" . Similarly, the long-term 
control is denoted by a policy which maps the reahzation of = {h^,..., U^) to an action 

For example, the short-term control 9x{t) = ^^{h{t)) and the long-term control Oy{t) = Cl\h''{t)) 
may correspond to the transmit power control on each wireless link and the flow control of a wireless 
network, respectively. The mixed timescale cross-layer stochastic optimization problem is given as 
follows. 

Problem 1 (Mixed timescale cross-layer stochastic optimization problem): 

VoiaH, e) = max E Oy-, h{t))] (3) 

subject to Wi{6x,9y;h(t)) = 0, i = l,...,We (4) 

Wi{ex,ey;h{t))<0, i = We + l,...,W (5) 

qj{ey;h\t))<0, j = l,...J,yh\t). (6) 

■ 

Vo can be decomposed into a family of inner problems and outer problems: 
Problem 2 (The inner problem): For given and hf. 



Piiey,h',h^) = max F{9x,ey;h',h^) (7) 

Ox ^0 

subject to Wi(ex,Oy-h(t)) = i = l,...,We (8) 

Wi{ex,ey;hit))<0, i = We + l,...,W (9) 

■ 

Problem 3 (The outer problem): For given /i', 

p2(/i') = max E[riiey,h',h^)\h^] (10) 

subject to qj{dy; h\t)) < 0, j = 1,...J. (1 1) 
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The collection of outer-problem solution 0*{h}) for each U in (|To|)-(|TT[) gives the optimal long-term 
policy Q}* . Similarly, the collection of the solution for the inner problem Vi{9*,h\h^) for a 

given (/i*, /i') gives the short-term optimal policy Q.'^* . 

We have the following assumption on the optimization problem Vq. 

Assumption 3 (Properties ofVo): We assume the following for the problem Vq, 

• Convex domain: The constraint domain specified by Q-Q is convex. 

• Concave objective: Given any CSI variable h £ Ti, the objective function F{6x, By] h) is strictly 
concave over [9^, Oy) € ]R+=" x 

• Smoothness of 9*{h''): Define an implicit mapping ijj : h} ^ 9* from the long-term CSI h} to 
the optimal solution 9*{h}'), i.e., ^(/i') solves the outer problem V2{h}') in (10 1. There exists a 
constant w < oo, such that - < ^IIC - for all G V}. 

■ 

Note that in the problem Vq, we may exclude equality constraints. However, one may always 
eliminate the equality constraints by either substituting them into the objective function or using the 



Lagrangian primal-dual method p2| , p3| . Moreover, the last assumption ensures that there is no 
jump in the optimal solution 9*{h^) along the long-term CSI hK 

Due to Assumption [3] there exists a unique optimal solution {Q.'^* ^Q}*) for the problem Vq. 

A strong motivation to such control variable partitioning is due to the information structure of local 
real-time and global real-time CSI observations in wireless networks. Due to the latency involved in 
global signaling of wireless networks, it is not scalable to adapt all the control variables to the fast 
varying short-term CSI h^. On the other hand, adapting the control only to the slow varying long-term 
CSI h} would be too conservative because it fails to exploit the real-time local CSI observations (t) 
for opportunistic diversity gain |31j |. One favorable way to strike a balance between performance and 
scalability is to partition the control variables as long-term y and short-term control 9y{t) = Q^{h{t)). 
Another motivation of the control timescale separation is the layered control architecture widely 
adopted in the wireless system |[1|, p4| . For example, the short-term control policy may correspond 
to physical layer control (e.g. power adaptation) and the long-term control may correspond to upper 
layer control (such as routing and admission control). 

B. Example: Power and Flow Control in Wireless Relay Network 

We illustrate the mixed timescale control with an example network with a macro BS and Nr 
RSs. A set of end users E each transmits one data flow to the macro BS with the assistance of 
a collection of RSs TZ. Fig. [T] illustrates a specific network topology with = 2 RSs, Nm = 4 
mobile users, iV = 6 links, and £ = {1, ... , 6}, £ = {1, 2, 3, 4}, 7^ = {5, 6}, = £" U 71 U {0}, 
where node denotes the macro BS. Wireless links towards a common receiving node share the same 
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time-frequency resource and multi-user detection (MUD) is used to handle cross-link interference. 
The maximum achievable transmission data rate at receiving node m is a set of rates that satisfy 



the following constraints pl| , 

J^Cfe <log (l + , V5c£+(m) (12) 

where £"^(m) denotes the set of links that inject data flows to the receiving node m. For example, 
in Fig. [l] £+(5) = {3, 4}, £+(6) = {2, 5} and £+(0) = {1, 6}. 

Denote rj as the flow rate from node j ^ £ and 0''(h') as the flow control policy that maps the 
large-scale fading variable h' = {h[, . . . , /i^) to the flow control r = (ri, . . . , Denote pj. as the 
power allocation on link k £ C and Q^{h.) as the power allocation policy that maps the CSI vector 
h = {hi,. . . , h]\f) to the transmission power p = {pi, . . . ,pn)- Corresponding to a two-timescale 
stochastic maximization can be formulated as follows, 

Problem 4 (Power and flow control in wireless relay network): 



max E 



(13) 



subject to Efce^Cfc <log(l + Efce5l^fclV) VcS C Vm G 7^ U {0} (14) 

Cfc = Eiec(fc) yk€C (15) 

Efce£+(m)Cfc-EfcG£-(m)Cfc = VmG7^ (16) 

where C{k) denotes the collection of data flows rj that routes through link k, and C^{m) denotes 
the set of links that carry data flows from the transmitting node m. ■ 
For example, in Fig. [T] C(4) = {r4}, C(5) = {r3,r4}, C(6) = {r2,r3,r4}, and C~{k) = k, for 
A; = 1, . . . , 6. We consider proportional fair utility p5| in the objective function of ( [T3] ). Constraints 



(14i-(15l are the MUD capacity constraints, and constraints in (16 1 are the flow balance constraints, 
where the incoming data flow should be balanced with the outgoing data flow at each RS. 

With timescale separation of the control and , the above stochastic maximization can be 
decomposed into families of inner problems and outer problems. 

Problem 5 (Inner power control): For given flow data rate r and CSI h, 

Q(r;h)=max -y^ZkecPk (17) 

subjectto Efc65Cfc(r) <log(l + Efe65l^fcP^'fc)' V5 C £+(m), Vm G 7^ U {0} (18) 

where Cfc(r) = Ei6C(fc) ^i' Vfc G £. ■ 
Problem 6 (Outer flow control): For given the large-scale fading variable h', 



max E 



E,e^log(r,)+Q(r;h) 



(19) 



subjectto Efee£+(m)Cfc(r)-Efce£-(m)Cfe(r) = Vm G 7^. (20) 
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Table I 



Problem associations between the example and the mixed timescale system model. 



In this example, the power allocation p in ( [13] ) corresponds to the short-term control variable 6x 
in ([3]), and the flow data rate variable r in ( 13 1 corresponds to the long-term control variable 9y. The 
objective function in Q is specified by 

F(p, r; h^ h') = log(^j) -VT.P''- 

Meanwhile, the inner problem constraints ([8]l-([9]) and the outer problem constraints ( [TT] ) are specified 
by (18 1 and (20 1, respectively. Moreover, the short-term control policy in Definition [2] is specified 
by the power control p(t) = QP{h{t)) in this example. Similarly, the long-term control policy ft^ 



is specified by r{t) = r2''(h') here. One can check that the objective function (13 1 is concave in 



(p, r) and the optimization domain specified by (14i-(16l is convex. In addition, as the Lagrangian 
function |32|, p3| of the constrained problem ([T3])-( 16 1 is continuous in h, the smoothness condition 
in Assumption [3] is satisfied as well. Table |l] summarizes the associations between the example and 



the mixed timescale system model in Section |II-A 



The control variable partitioning is motivated by the information structure of the wireless relay 
network. On one hand, the CSI hk is available at some corresponding receiving node m = {k : 
k G £+(m)}. Hence the inner problem is solved locally at each receiving node m £TlU {0} based 
on real-time CSI {hj. : VA; G £+(m)}. On the other hand, solving the outer problem with the flow 
balance constraints ([16]) (or ( ]20] )) requires a global coordination. Updating the variable r needs to 
have the global knowledge of Q{r; h) and the statistics of the long-term CSI h'. It also needs to 



handle the global coupling induced by the flow balance constraints (16) (or (20)). As a result, explicit 
message passing is involved and the update can only be done in a longer timescale. 
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Figure 2. An illustration of the two-timescale algorithm. 



C. Iterative Solution of the Mixed-Timescale Stochastic Optimization Problem 

In this section, we discuss a stochastic gradient based two-timescale algorithm for solving Vo{aH, e), 
which consists of an inner iteration and an outer iteration. We are interested in the case where the 
inner and outer problems Vi and V2 do not have closed form solutions and iterations are needed to 
find the optimal solution. 

Let X = {Ox, y^x) be the variable for the inner iteration, where 9x is the short-term control variable 
in (jv]) and A^^ is an algorithm specific auxiliary variable|^ Similarly, let y = {6y,Xy) be the variable 
for the outer iteration, where 6y is the long-term control variable in ( [T0| ) and Xy is the auxiliary 
variable. The time is partitioned into frames with duration r and slots as illustrated in Fig. [2] One 
frame consists of Ng slots and local real-time CSI is acquired at each node at the beginning of each 
frame. During the n^-th slot and the rif-th frame (nj = [ns/Ns\), the short-term variable Xn^ and 
long-term variable are updated according to the following mixed-timescale iterations. 



V 



x{y) 



Xn^-l + 'yn,G{Xn,-i,yn/,h''{nfT),h\nfT)) 



Vrif = Vy ynf-l+ PnfK{Xn,,ynf~-i;h''{nfT),h\nfT)) 



(21) 
(22) 



where and are the step size sequences, Vt> [•] is an Euclidian projection onto the domain V, 
and X{y) and y are the domains related to the problem constraints ([8])-(|9]l and (111. Fig. [2] illustrates 



the timescales of the iterations in (21i-(22i. Each node acquires local CSI at each frame boundary 



and updates the short-term control variables Xn^ once at each slot according to (21 1. The centralized 



controller updates the long-term variable once every frame according to (22i. 



In addition, we consider the following assumptions on the iterations (21i-(22). 



^In Lagrange primal-dual methods, is the Lagrange multiplier. 
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Assumption 4 (Properties of the iterations): Denote M-{.) = {G{.), K{.)) as the mapping of the 
joint iteration vector {xn^^ynf)- We assume the following properties: 

• Definiteness of the iteration mappings: The Jacobian matrices of the iteration mappings Gx{-) — 
VxG{.), Ky{.) = VyK{.), and VA4 have the following properties: There exists Ux, cty, a > 0, 
such that x^GxX < — a^^Hxp, y'^Ky-y < —ayWyW"^, and (x, y)^VM{x, y) < —a (Hxp + ||y|p) 
for all (x, y) G {UyeyX(y)) x y, given any (/^^ h^) x UK 

• Lipschtiz continuous and bounded growth: There exist positive constants Ix and ly, such that 
\\K{x,yi].) - K{x,y2:,.)\\ < ly\\yi -y2\\, for all yi and y2, and \\Kx{x,y] .)x\\ < lx\\x\\, for all 

X. 



Note that the first assumption is easily satisfied in a compact domain. It is needed to guarantee 



the iterations (21i-(22) to be stable under exogenous variation of h{t), where the parameters ax 
and tty indicate the convergence speed of the inner iteration and the outer iteration, respectively, 
and a indicates the convergence speed of the whole algorithm (under one-timescale). The second 
assumption is a standard assumption for studying the convergence. In this paper, we are interested in 



the convergence of ( 2 1 )-( 22 1 to the stationary points defined as follows. 

Definition 3 (Stationary points of (|27|-(|22]) j.- Given y and (/i*, /i'), the partial stationary point x{y; h^, /i') 
of ( [2T] ) is given by the solution of 

X + G{x, y; h"" , /i') 



x-V 



0. 



Similarly, given and h , the stationary point (x* (h) , y* {h )) of (21)-(22i is given by the solution 



of 



x-V 



x + G(^x,y;h'',h^) 



y-Vy y + EK(x,y;h\h^) 








Under Assumption [4| the partial stationary point and the stationary point defined above is unique. 

Remark 1 (Interpretation of the projection): The projection Vv{6) is to find the nearest point x G 
V from 9, i.e., x = Vv{d) is the solution to the minimization problem min^ ||x — 6*112, subject to 
X ^V. Note that if the constraint set is a hyper-rectangle, i.e., V = n^^[aj,6i], the projection can 
be computed by restricting each component as Cj < x^*) < 6j. When a general constraint set V is 



considered, the projection can be computed by Lagrange multipliers |32|, |33|. ■ 
Remark 2 (Examples of iterative algorithms): Different choices of the iteration mappings G{.) and 
K{.) yield different variants of the stochastic algorithm. We review a few commonly used algorithms 
in the following. 
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Stochastic projected gradient /T^, [19]: The mappings G{.) and K{.) are the gradients of the 



objective function F{.) in (|3]l. Specifically, 

]h''(tnf),h\tnf)) = TxV xF{Xn,-l,yni,h%nfT),h\nfT)) (23) 

K{xn,,yni~i]h''{tnf),h}{tnf)) = FyV yF {xn, , Vuf-i, [n fT) , {n jt)) (24) 
where and Ty are positive definite scaling matrices to accelerate the convergence. The iteration 



variables and in (21l-(22i correspond to 6^ and 6y in d3b, i.e., 9x{ns) = Xn, and 



Oy{nf) = In addition, the projection domains in (21 i-(22i are specified by 



X{y) = {e^ G M^^- : © - (|9) are satisfied}, y = {Oy e : ([TT]) is satisfied}. (25) 
Note that K{.) is a stochastic estimator of the desired gradient descent direction 



VyIE[max^F(x,2/„,_i;/i(n/T))|/i'(n/r)] (c.f. (To), Q). 

Stochastic primal-dual algorithm /[?7|/: We first form a Lagrangian function of the problem 
in 

L{6x, Oy, Xx, Xy] h) = F{9x,0y; .) — ^ Xx,i'Wi{9x, Oy] •) — ^ ^yjQji^^yi ■) 

i j 

where A^^ > and Xy > are the Lagrange multipliers for the primal variables 6x and 9y, 
respectively. Let x = {9x,Xx) and y = {9y,Xy). The mappings G{.) and K{.) for stochastic 
primal-dual algorithm are given by 



G{. 



^e^L{9x, 9y, Xx, Xy] h) 
-^kL{9x-, 9y, Xx, Xy] h) 



, and K{.) 



^eyL{9x, 9y, Xx, Xy] h) 
-^\yL{9x, 9y, Xx, Xy-, h) 



In addition, the projection domains are given by 



X{y) = M.l- xM.^ , and y 



w 



V X 



(26) 



(27) 



Under Assumption |4j the convergence of (21)-(22l can be established from standard techniques 



19| , 1 36 1 for static CSI and h\ as summarized below. 
Theorem 1 (Convergence under static CSI and h!"): Consider G{.) and K{.) are given by (23 1 



(24 1 (or (26 1), with the projection domains X and y given by ( [25] ) (or ([27])). If the step size 
sequences and satisfy '^^In, = oo,Z]/^n/ = °° ^'^'^ S 7n, < o^iS^^n/ < then the 
iteration {xn^,ynf) in (21 1-([22[) converges to the stationary point {x* {h'^ ,h}),y* {h^)). Furthermore, 



{9*{h',h^),9*y{h^)) solves the problem in Q. ■ 
However, when the CSI h^{t) and h\t) are time-varying, the above convergence is not guaranteed. 
This is because, on one hand, the inner iteration Xn^ may not converge and hence induce bias to 
the estimator K{.) for updating ynj,- On the other hand, the optimal target y*{h\t)) is time-varying 
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as well, and existing convergence results |18|, |19| fail to apply. In this paper, we shall focus on 
investigating the convergence behavior of the mixed-timescale iterations ( 2 1 1-( 22 1 when the CSI hi" (t) 
and h'^it) have mixed-timescale stochastic time- variations. 



D. Impact of Exogenous Variations and Tracking Errors 



With the presence of the exogenous variations of h{t), the mixed timescale iterations in (21 1-(22) 



are continuously perturbed and the convergence to the stationary point target {x*{h{t)),y*{h}-{t))) is 
not guaranteed. The impact of the exogenous variations can be summarized as follows, 

• Impact on the inner iteration Xn/- Since h{t) is time-varying, needs to track the time- 
varying optimal point x*{h{t)). The convergence error may depend on the relative variation 



speed between the iteration dynamics (21 1 and the exogenous variations of h(t). 

Impact on the outer iteration i/n/. Likewise, the optimal point y*{h^{t)) is time-varying, and 

Unf rnay hardly reach y*{h\t)). Moreover, the convergence error of the inner problem yields 



'Pi — 'Pi 0, which induces a bias to the estimator K{.) in (22i. 
We formally define the tracking error for the iterations ([2T])-([22|). 

Definition 4 (Tracking error): The mean square tracking error for the short-term control variable 
X is defined as 



1 

Cj; = lim sup — E I \\xiN^ — x{yi, h^^ir), h\iT) 



whereas, the tracking error for the long-term control variable y is defined as 



By = lim sup 



1 

If ^-^ L 

J 1=1 



y*{h\iT) 



(28) 



(29) 



In the rest of the paper, we shall study the tracking errors ex and Cy under the mixed-timescale 
CSI h'{t) and h}{t). 



III. Virtual Dynamic Systems for Convergence Analysis 
In this section, we derive the virtual dynamic systems for studying the tracking error of the iterations 



(21 1-(22) under time-varying CSI. We first consider a mean continuous time dynamic system (MCTS) 
which captures the mean behavior of the mixed-timescale iterations in (21 i-(22l under static hK Using 



SDE approximations, we then extend the results to consider the impact of time- varying h\t) on the 
overall tracking errors and derive the VSDS. 
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Figure 3. Illustration of the reflection term Zx{t) when the virtual state trajectory Xc{t) reaches the boundary of the 
constrain domain. 



A. Case 1: The Mean Continuous -Time Dynamic System (MCTS)for Static and Time-varying h'^{t) 

In this subsection, we consider the case for static and time-varying h'^{t). Under time-varying 
h^{t), constant step size 7 should be used in the inner iteration (21 1 to track the time-varying partial 
stationary point x{t). On the other hand, since the stationary point y*{h}-) is static, a diminishing step 



size iirif can be used in the outer iteration ( 22 1 to assist the convergence. We derive a mean continuous- 
time dynamic system (MCTS) to characterize the "mean" behavior of the algorithm trajectories for 
(|2T)-([22]). The MCTS is defined as follows. 

Definition 5 (Mean continuous -time dynamic system (MCTS)): The state trajectory of mean continuous- 
time dynamic systems (MCTS) Xc{t) and yc{t) are defined as the solutions to the following Skorohod 
reflective ordinary differential equations (ODEs) ||38|, 



xc = G{xc,yc,h',h^) + z^ (30) 
2/c = k{y^,h}) + Zy. (31) 

where x^ = ^Xc{t), ijc — ^Ucit)- The terms Zx{t) and Zy{t) are the reflection terms to keep the 
trajectories x and y inside their domains X{y) and y, respectively. The function k{.) is defined as 

k{y,h')^ Ihn E[K{x{y,h{nfT)),y,h{nfT))] (32) 

ri/— >cxD 

where K{.) is the iteration mapping specified in ( p2) ). ■ 
Note that since the short-term CSI process h'^{t) in ([T]l is ergodic and stationary, the limit in (32) 
always exists. 

Remark 3 (Interpretation of the reflection): The reflection term is the minimum force to restrict 
the trajectory inside the constraint domain. Taking Zx for example, for Xc G X, in the interior of X, 
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Zx = 0- For X E dX, the boundary of X, lies in the convex cone generated by the inward normals 
on the surface of the boundary as illustrated in Fig. [3] The magnitude of Zx is such that G{.) + Zx 
lies in the tangent of the surface. Therefore, when the trajectory reaches the boundary, it can only go 
along on the boundary. Note that Zx{t) and Zy{t) can be computed using Lagrange multipliers when 
the constraint domain are not hyper-rectangles. Please refer to Appendix [A] for a derivation of the 
reflection terms. ■ 



In the following, we illustrate the connection between the algorithm trajectory (21)-(22l and the 
MCTS ([30])-(|3T). We first derive the property of the equilibrium of MCTS. 

Definition 6 (Equilibrium and partial equilibrium): The point {x*,y*) is an equilibrium of the 
MCTS (|30l)-(|3T]l under £ V.' x UK if Xc = ijc = 0, i.e., Gixl,y*,h'',h^) + Zx = and 



k{y*,h'-) + Zy = 0. In addition, for any yc G y, Xc{yc-,h) is a partial equilibrium of the MCTS in 
(|30|l if £c = 0, i.e., G{xc, yc, h", /i') + Zx = 0. ■ 



There is a strong connection between the iteration trajectory and the MCTS as summarized in the 
following theorem. 

Theorem 2 (Connection between the algorithm trajectory and the MCTS): Assume e = (static 



long-term CSI h^). Consider G{.) and K{.) are given by (23l-(24i (or (26l), with the projection 



domains X and y given by ( [25| (or ([27])), and the step size sequences satisfy (i) = 7 for some 
small enough 7 > 0, and (ii) ^ = 00 and ^ ^u^^ < 00. In addition, the short-term CSI timescale 
is much slower than the algorithm timescale, i.e., auT ^ 7- Then the algorithm iteration trajectory 



{xn^{t), Unfit)) in (21i-(22i converges to the virtual state trajectory of the MCTS {xc{t),yc{t)) in 



30|)-([31]) in probability, i.e., for any t] > 0, 

lim sup Pr{||x„^(t) - Xc(t)|| > r/} = 0, and lim sup Pr{||y„^(i) - yc{t)\\ > 7]} = 



t—>-oo 



t— >oo 



where ns{t) = [tNg/rl and nf{t) = [t/rj. 



The theorem is established using stochastic approximation |18|, |19|. We sketch the proof in 
Appendix [B] 

From Theorem [2| the study of the algorithm convergence is equivalent to the study of the stabili 



of the MCTS. The MCTS in (30l-(31l can be viewed as the desired "mean" trajectory in the 



continuous-time counter-part of the discrete-time algorithm iterations (21i-(22i, which has filtered 
the noisy perturbation induced by the exogenous variation of h^{t) in the estimator K{.). Hence, 
the stability of the MCTS provides a necessary and sufficient condition for the convergence of the 



original iterations (21i-(22) under static h . 



''a deterministic dynamic system x — f{t,x) is asymptotically stable at the equilibrium x", if there is a 5 > 0, such 
that for any ||3::(0) — < 5, x{t) — >■ x", as t — >■ 00 |39j. 
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Note that, under Assumption |4j the MCTS (22) is asymptotically stable |39|, i.e., ydt) — )• y* as 



t — >• oo. Therefore, from Theorem |2j we have the following result on the convergence under static /i^ 
and time-varying h^{t) when the CSI timescale is sufficiently slower than the algorithm timescale. 
Corollary 1 ( Convergence under static h} and slow time-varying (t) ): For sufficiently fast algo- 



rithm iteration, i.e., uht <^ 7, the iteration {xn^[t)jynf{t)) in (21 1-(22 1 converges to {x*{t),y*) almost 
surely, i.e., — — ^ and \\ynf{t) — y*\\ —^0 almost surely, as t — ^ oo. ■ 

Fig. |4]a) illustrates the relationships between the algorithm iterations, the virtual dynamic system 
MCTS and the moving equilibrium of the MCTS for the case when both h'' and h'- are static. The 
virtual system MCTS as well as the algorithm iterations Xn^ and yn^ would eventually converge 
to the static equilibrium (x*(/i), Fig. |4]b) illustrates the case for static /i' and slowly time- 
varying /i'^(t). For variable y, there is a static equilibrium y*{h}-) of the virtual dynamic system 
MCTS, and the virtual system state yc{t) converges to the static target y*{h!'). For variable x, the 
trajectory of the partial equilibrium x{yc{t),h{t)) is driven by the time-varying ydt) and h{t), and 
it eventually converges to the trajectory of the moving equilibrium x*{h{t)), as ydt) converges to 
y*{h}). Meanwhile, the dynamics of the MCTS tracks the moving partial equilibrium x{ydt), h{t)). 



For both x and y, the algorithm iterations and yn^ in (21i-(22) roughly follow the dynamics 
of the virtual system MCTS. However, the case is different under time-varying h\t) and h^t), as 
illustrated in Fig. |4]c). The equilibrium y*{h\t)) of the virtual dynamic system MCTS moves around 
and ydt) never converges. Affected by the time-varying y*{t) and the error gap ydt) — y*{t), the 
dynamics of the partial equilibrium x{ydt), h{t)) cannot converge to the optimal trajectory x*{h{t)). 



Nevertheless, the algorithm iterations and in (21 i-(22i still follow the behavior of the virtual 
dynamic system MCTS. 

In the next section, we extend the MCTS equivalence framework to the case with time-varying 
and uht ~ 7- 

B. Case 2: Virtual Stochastic Dynamic System for Time-Varying h'^{t) and h^t) 

When h'-{t) is time-varying, the algorithm iterations {xn^,ynf) in (21)-(22i should continuously 



track the time-varying partial optimal point {x{t),y*{t)), which is a moving target as illustrated in 
Fig. |4[b). As such, we consider constant step size jn, = fJ-nf = 7 instead of diminishing step size. 

We need to first quantify the dynamics of the moving partial equilibrium {x{t),y*{t)) from the 
MCTS under the variation of Using the CSI timescale separation property e <^ an for h^{t) 

and the dynamics of the moving target {x{t),y*{t)) is given by: 

Lemma 1 (Dynamics of the moving partial equilibrium): Define G{x,y,h^,h^) = G{x,y,h^ ,h^)-\- 
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Figure 4. Illustrations of the algorithm iterations, the virtual dynamic system MCTS and the moving equilibrium of the 
MCTS. (a) illustrates the case for static h'' and h", (b) illustrates the case for static /i' and time-varying h^lt), and (c) 
illustrates the case of time- varying h'{t) and h"{t). 



Zx- The dynamics of Xc(t) and y*{t) are given by 



;33) 



dy* = ^h^{U)dh' (34) 

where = ^G{xc,yc,h'',h^), Gy{.) = ^G{xc,yc,h'',h^), Gh^{.) = ^G{xc,yc,h'',h^) and 

'^h^ = ^ipih^) as defined in Assumption |4] ■ 



19 



Proof: Please refer to Appendix [C] for the proof. 



With the notion of MCTS and the dynamic of the moving equilibrium in ( 33 1-( 34 1, the instantaneous 
tracking errors can be expressed as 



and 



Xn,{t)-Xc{t) = {Xn^(^t) - Xc{t)) + {Xc{t) - Xc{t)) 

Vufit) - y*cit) = ivnfit) - Vcit)) + ivcit) - y*it)) 



(35) 



(36) 



where '^[{t) = 



c{t)) and yj{t) = {ynf{t) - yc{t)) are the scaled error gap between 
the iteration trajectory (|2T])-(|22]| and the MCTS, and = Xc{t) - Xc{t) and yl{t) = ydt) - y*{t) 
are the scaled tracking error from the MCTS to the moving partial equilibrium ([33])-p4]). 

Even though it is very hard to quantify the tracking errors Xn,(t) ~ Xc{t) and ynf{t) ~ y*c{t)^ we 
can try to study the distributions of the decomposed error states xl^t), Xc(t), yl{t), and ylii) with 
the help of a virtual dynamic system defined in the following. 

Define a joint virtual state u{t) = (xc{t),ycit),x''^{t),y^{t),h%t)) £ M2Ar,+2Af„+Ar^ ^j^^^.^ ~^^~e ^ 



, yc,yc e 



and is a short-term virtual CSI state with initial value /i*(0) = h^{0). 

Correspondingly, define a virtual long-term CSI state h^{t) as the solution of d/i' = —j^Hi{t)dt, 
with initial value /i'(0) = /i'(0), where HL{t) is an N x N diagonal matrix, with the j-th diagonal 
element being CQiDj{t)~''~^Vj{t). We use a short hand notation G~^Ghs[.) to stand for the matrix 



G^^{xc,yc,h'',h^;t)Gh^{xc,yc,h',h^;t) from (|34|). The VSDS is defined as follows. 

Definition 7 (Virtual stochastic dynamic system (VSDS)): The virtual state trajectory of the VSDS 
u{t) is characterized by the following SDE: 



where 



\ 



du = Ui{t, u)dt + U2{t, u)dWt + dZu 

Gx{xc, yc, h'-)xc + Gy{xc, yc, /^^ h^)yc 
N-^Kx{xc, yc, ■)ixc + xl) + N-'^Ky{xc, yc, •)yc 
G(x,,y,J\V) - ^^G~'Ghs{.)h' + G-^Gyi.)N-^kiy,,h') 
N~^kiyc,h') + j^iJh^(h')HL{t) 

1 aHr Ts 
2 N.-y"' 



(37) 
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and 



U2it,u) 



On^xN 



On^xN^ 



On^xN, 







N^xN^ 



\ 



Moreover, Wt is a {2Nx + 2Ny + A^) -dimensional Brownian motion, dZu = {On^, dZx + dZy, 
G~^Gy{.)dZy, dZy, AT ) IS the reflection term, and S(yc; /i') is the covariance matrix for the stochastic 
estimator K{.) in ([22]). ■ 
As is summarized in the following theorem, the VSDS provides a weak convergence limit (con- 



vergence in distribution, c.f. |19|, |36|) to the dynamics of the error gaps xj{t), x'^{t), yl{t), and 
Vcit), when 7 — > 0. 

Theorem 3 (Algorithm Tracking Errors and VSDS): Assuming CSI timescale separation for 
and h^{t), i.e., e ^ an, the joint state (xl,yj, x^, y^, h'^) weakly converges to u{t), as 7 — ^ 0, which 
is the solution to the VSDS in ([37). ■ 
Proof: Please refer to Appendix [D] for the proof. ■ 



The above results allow us to work with the VSDS in (37 1 to study the convergence of the mixed 



timescale tracking algorithm (21 i-(22i. The VSDS provides statistical dynamics for the decomposed 
tracking error states. We formally summarize the connection between the VSDS and the tracking 



errors for the mixed-timescale iterations ( 2 1 1-( 22 1 in the following theorem. 

Corollary 2 (Connections between the VSDS and the iterations): Assuming CSI timescale separa- 
tion for h^t) and h'^{t), i.e., e <^ a^, the tracking errors for the algorithm iterations (21 i-(22i defined 
in ( |28| ) and ( |29| ) can be upper bounded from u{t), i.e., 

e^. < lim sup - max (7, 1) / E [\\xcis)f + \\xl{s)f] ds 

i— >oo t Jo 

and 

ej, <limsup -max(7,l) / E [\\y^{s)f + \\y^^{s)f] ds 
t— s>oo E Jo 

where Xc{t), xl{t), ydt), and yc{t) are the components of the joint state u{t) in the VSDS ( [37] ). ■ 
Corollary [2] can be seen from the tracking errors (35l-(36l and the results in Theorem [3] With 
Theorem |2j we can focus on studying the stochastic stability (formally defined in Section |IV-A[ ) of 
the VSDS in ([37|) in order to understand the convergence behavior of the mixed timescale iterations 



in (21 1 and (22 1. 



Moreover, the VSDS suggests that the tracking errors consist of two parts: (i) the steady state error 
7||xc|p and 7||yc|P, which arc the mean square error gaps between the iteration trajectory (x^^,?/^^^) 
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in (21 i-(22i and the MCTS {xc{t), yc{t)) in (30l-(31 1, and (ii) the mean tracking error \\x 



and 



which are the mean square distances between the MCTS and the target moving partial 
equilibrium {x{t),y*{t)) in (33l-(34i. Note that when /i' is static, i.e., HL{t) = 0, the tracking error 



Vcit) in the VSDS converges to 0, and hence, there is only steady state error 7||yc|P (due to constant 
step size) for the long-term variable y. 



IV. Convergence Analysis of Mixed Timescale Iterations 



In this section, we derive the tracking error bound of the mixed timescale iteration (21i-(22) 



by studying the VSDS obtained from Section III We first briefly review the Lyapunov stochastic 



stability techniques. By studying the stability of the VSDS, we then derive a sufficient condition for 
the convergence of the mixed-timescale algorithm. Moreover, we obtain a tracking error bound in 
terms of the parameters of the exogenous process h{t). 



A. Preliminary Results on the Lyapunov Stochastic Stability 

It is very hard to derive the exact solutions for the VSDS. Instead, we are interested in the expected 
upper bound value of the joint state which represents the aggregated tracking error Cx + 

of the iterations. This is captured by the stochastic stability in mean square defined as follows. 

Definition 8 (Stochastic stability in mean square): Given any initial state u{0) G U, the stochastic 
process u{t) is globally stochastically stable in mean square, if there exists < 5 < oo, such that 
limsupi^oo ^/oE||u(r)f dr < (5. ■ 

We use a Lyapunov method to study the stochastic stability of u{t). Define a non-negative function 
V{u) = ^u^u along the trajectory of u{t). The function has the property that V{u) ~ which 
plays the role of an energy function, where a larger ||n|| gives a larger function value. We summarize 
the main techniques of stochastic stability analysis as follows. 

Definition 9 (Lyapunov drift operator): Consider a stochastic process u{t) and a real- valued Lya- 
punov function V{u). The Lyapunov drift operator is an infinitesimal estimator on V{.) defined as 
CV{u) = lima;o -s [E [V{u{t + 6)\V{u{t)] - Viuit))] . ■ 

Lemma 2 (Stochastic Stability from Lyapunov Drift): Consider a function f{u) that satisfies f[u) > 
a||u||^ for all u eU, and some a,r > 0. Suppose the stochastic Lyapunov drift of the process u{t) 
has the following property 

CV{u) < -f{u) + g{s) (38) 

for all u ^U, where s{t) is a stochastic process that satisfies limsup4_^oo j Jq E \g{s{T))] dr < d for 

some function g{s) and d < oo. Then the process u{t) is stochastically stable, and 

1 /■* d 
lim sup - / E||u(T)|rdT < -. 
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The above result is based on the Foster-Lyapunov criteria for continuous time processes in |40| 
and is a simple extension of the results in ||24j Theorem 2]. The advantage of the Lyapunov method 
enables a qualitative analysis of the SDE without explicitly solving it. Using such a technique, we 
derive the stability results for the mixed timescale algorithm in the following. 

B. Stability of the Mixed-Timescale Algorithm 

Corresponding to stability of a random process, we define the iteration stability as follows. 



Definition 10 (Stability of the iteration): The iterations (21 1 and (22) are stable if the correspond 



ing tracking error defined in (28 1 and (29 1 are bounded, i.e., there exists a i? < oo, such that 



ex + ey < B. 



Note that due to the stochastic iterations, the tracking errors defined in (28 1 and (29 1 may be 
unbounded statistically. To study the algorithm stability, we can equivalently investigate the stability 
of the VSDS u{t). Towards this end, we first construct a Lyapunov drift CV{u) on the trajectory of 



the VSDS in (37). We then proceed to find a function g{s) that satisfies condition (38 1. Finally, we 
use Theorem |2] to obtain the stability result. 

We summarize the sufficient conditions of the stability of the mixed timescale algorithm as follows. 

Theorem 4 ( Sufficient conditions for the algorithm stability ): Assume CSI timescale separation for 
h''{t) and i.e., e ^ an- Suppose that there exist < VH,Vy < oo, such that < vh 

and < Vy. Then the sufficient condition for the algorithm to be stable is given by 

aNs (^8ax " 7^^) " 2^' " 2^^^;' > 0. (39) 

■ 

Proof: Please refer to Appendix |E] for the proof. ■ 
The above results have several implications on the convergence. 

• Convergence of the inner problem: The term Ng (sax — jfr^vj^^ specifies the convergence 
behavior of the inner problem. Recall that the parameter controls the variation speed of 
the fast changing CSI h^{t), Ox represents the convergence rate of the inner problem, r is the 
frame duration, Ng is the number of slots per frame, and 7 is the step size. As such, for a given 
an, we need to have sufficiently fast inner iterations (ax) or sufficient number of slots per frame 
(Ns) in order to have bounded tracking error. 

• Convergence of the outer problem and the coupling effect: The convergence of the inner problem 
and outer problem is coupled together. The stability of the inner problem (a positive ^otx — jfr^v'jj) 
is a premise of the stability of the whole algorithm. To achieve the stability, we desire small 
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vh and l^, which represent the sensitivity of the partial stationary point x w.r.t. the change of 
h'^{t). On the other hand, we also want the parameters ly and Vy to be small, which means that 
y* shall not be quite sensitive to the bias induced by the tracking error — x{ynf,-)- 
• Impact from the iteration timescale: One can reduce the frame duration r, increase the number 
of slots Ng per frame, or increase the step size 7 to enhance the stability of the overall algorithm. 
However, shortening the frame duration r may result in a larger amount of signaling overhead 
to update the long-term variable yn; and the acquisition of local CSI h^{nfT); increasing the 
number of slots Ng yields a higher computational complexity; and moreover, a large step size 
7 may give larger steady state error 0(7) for the discrete-time trajectory. 



C. Upper Bound of the Tracking Error 

Stability is only a weak result of convergence. We are interested in the tracking error bound of the 



algorithm. Under the sufficient condition specified in (39 1, using the Lyapunov technique in Lemma 
[2j we study the result on the upper bound of the tracking errors Cx and Cy. 

Theorem 5 (Upper bound of the tracking error): Assume the conditions in Theorem |4] If S = 
limsup(_^oo i Jo ^ '^'^ < the tracking errors and Cy are given byi 

ex + ej, < ^ (r S + C) (40) 

where p = {Nsa^a^ay), rj = O (^^/N^al + a^y C = ^N{1 + vjj) + 0(62^^27-2), and N 
is the dimension of the CSI vector h^. ■ 
Proof: Please refer to Appendix [F] for the proof. ■ 
The above result shows that the upper bound of the tracking error depends on several important 
parameters, such as the timescale parameter an for h^{t), the sensitivities vh and Vy of the stationary 
points X and y*, respectively, as well as the sensitivities zu of y*{h}-) over hK A faster time- varying 
scenario corresponds to larger qh, which result in a larger tracking error bound. In addition, we can 
observe the foUowings. 

• Special case for static and h'^: Under static CSI, where the CSI timescale parameters an = 
e = 0, we have the term C = in the error bound. The tracking error is governed by the steady 
state error ^rS due to the constant step size 7. Note that if diminishing step size is used for the 



outer iteration ( |22| ), i.e., — )• 0, the tracking error bound in (40 1 becomes 0. This corresponds 
to the case in Fig. |4]a). 

Special case for static and time-varying h^{t): Under static h'-, where long-term CSI timescale 
parameter e = 0, the term C decreases, because the term 0(e2ti72r27-2) becomes 0. In particular. 



if diminishing step size is used for the outer iteration (22 1, i.e., — 0, the error bound becomes 
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and is mainly contributed by the tracking error in the inner iteration. This corresponds to 
the case in Fig. |4]b). 

Impact of the algorithm parameters: When the CSI is fast changing, i.e., the CSI 

timescale parameters au and e are large, one can reduce the frame duration r, increase the 
number of slots Ng per frame, or increase the step size 7 to reduce the tracking error, with the 
price of larger signaling overhead, higher computational complexity and larger steady state error 



0(7) as discussed in Section IV-B 



V. Compensation for Mixed Timescale Iterations 



In the previous sections, we have analyzed the convergence behavior of the iteration (21i-(22) 
under mixed timescale time-varying CSI h{t). In this section, we shall enhance the algorithm for 
better tracking performance. Since the convergence of the outer long-term variable y depends on 
the convergence of the inner problem, it is essential to accelerate the convergence of the short-term 



variable Xn,- Towards this end, we introduce a compensation term in the algorithm (21 1 to offset the 



exogenous disturbance to the VSDS in ( 37 1 



A. Adaptive Compensations for the Time-varying CSI 

Recall that in the mixed timescale iterations ([2T])-(22 1, the inner iteration tracks the moving target 



x{t) driven by the time-varying h{t) and y{t). On the other hand, the outer iteration tracks the 
moving target y*{t) driven by As a result, when h{t) and y{t) are time-varying, they generate 

disturbance to the tracking iterations ( [2T] ) and (22 1. This can be seen from the dynamics of the error 



states xl{t) and y^{t) in the VSDS in (37 1 (c.f. equations (75 1 and (76 1) 



dxl = G{xc,yc,h',h')dt +G-^Gh^{.)dh' +G-^Gy{.)dy, (41) 

' ^ ' ' 

exogenous disturbance from (t) distrubance from y{t) 

dyl = N;^k{y,,h')dt -^Ph'ih^)dV . (42) 

^ V ' 

exogenous disturbance from (t) 

Note that, from Theorem [2] and Corollary [l] when dh'- = 0, the error dynamic system (42) is 



asymptotically stable and the error state y^{t) converges to 0. In addition, when dh^ = 0, the system 



(41 1 is stable and the error state xl{t) converges to as well. However, with the presence of the 
exogenous disturbance dh'^ and d/i', the error states and yc{t) are continuously disturbed and 

may fail to converge to the origin. This is illustrated in Fig. [5] 

Such observation motivates us to offset the exogenous disturbance to reduce the tracking error under 



time-varying and h^t). We start by introducing compensation terms to the MCTS (30l-(31l as 
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/ / 



(b) 



idisturbanc e from h'(t) 



the "mean" descent 
direction 
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Figure 5. Illustrations of the convergence of the iterations {xn^,ynf) and the virtual error dynamic systems Xc{t) and 
yt{t). (a) The algorithm iteration Xn^ tracks the target x{y{t),h{t)), which is moving due to the time-varying y{t) and 
h{t). (b) The corresponding error state x^{t) approaches to under the mapping G{.). However, the exogenous disturbance 
Gx^Gh^ {•)dh" and G^^Gy{.)dyc drag it away from the origin, (c) The algorithm iteration y^^ tracks y*{h^{t)) following 
the "mean'Virtual trajectory ydt). (d) The corresponding error state yc{t) approaches to following the virtual direction 
fe(.). However, the exogenous disturbance ipf^i{h})dh} drags it away from the origin. 



follows, 



dxc = G{xc,Vc,h',h^)dt -G^^Ghs{.)dh' -G^^Gy{.)dyc (43) 

' ' ' 

compensation for h=^{t) compensation for y{t) 

dye = N-^k{y„V)dt +i:h^(h})dV (44) 

' 

compensation for {£) 



where —Gx^Gfi^{.)dh'' and —Gx^Gy{.)dyc are compensation terms to offset the disturbance in (41 



and 'il)iTi{h})dh} is a compensation term to offset the disturbance in (42). Here, for a simple discussion, 
we drop the reflection terms dZ^ and dZy, since the projections that drive the reflections are always 
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conservativerl As a result, the dynamic system in (41 i-(42i becomes 



Gixc, Vc, /^^ h')dt + G-^Gh^ (.) - G^^G^. (.) dk 



(45) 



+ (^G-'Gy{.) - G?G^(.)) dy, 

N-^kiy„V)dt+(i^h'{y)-i^h'(h') ] dV. 



(46) 



Define the disturbance components as </9^(.) = — G^r^G^s (.), = — G~^Gj^(.) and ^y{-) = 

iph'i^)- Consider (^^(x;.), ip%{x;.) and ipy{y;.) as the compensation estimators for the disturbance 
components (Pxi-), ^x{') and The corresponding compensated mixed timescale algorithm is 

given by, 



(47) 



(48) 



a;n,-i + ^G{Xn,-l,ynf;h''{nfT), h\nfT)) 

= T^i' + 7i^(x„^,y„^_i;/i''(n/r),/i'(n/T)) + (^()(y„^_i;.)(A/i')„^ 

where (A/i^)„^ = /i^(L^Jr) - /i^(L^Jr), (Ay)„^ = y^^j - y^^j, and (A/i^n, = /i'(n/r) - 
h}{nfT — t). The compensation terms are non-zero on the frame boundary. 

B. Derivation of the Compensation Estimators 

Obviously, if we can precisely estimate the disturbance, its impact to the convergence can be totally 
suppressed and the tracking errors and go to zero. Unfortunately, we cannot obtain perfect 
estimations of the disturbance terms — G~^Gh=(.), —G'^^Gy{.), and ^/^/^((/i'), because they require 
closed form expressions of the target {x{y,h'^ ,h}),y*{h^)). In this section, we derive approximate 
compensation terms using Lagrange duality theory |[32|, p3| and implicit function theorem in calculus. 



Q{Ox, Ax; By, h'^, hi") 



1) Compensation for the Short-term Iteration Xn, ■' Consider the optimality condition |33| for the 
inner problem (|7]), 

' V,F{e,, Oy- /l^ h^) - Zi Xx,iVW^{0,, Oy- .) 
{^X,iWi{9x, By] 

where Xx-i > and Wi{9x,0y].) < 0, for all 1 < i < W. From the Lagrangian duality theory, 
g{x]ey,h',h^) = has a unique solution x = 



X, Xx) for Xx > 0, and the dynamics of x{t) should 
-jt -r yftn-^) V"ar ^ ^yy-^TJUr ~ Suppose the matrix Qx{x;.) is invertible. Using 
the implicit function theorem, the compensation estimators can be given by 



satisfy Gxix; .)f + OIf + ^yi^'^ 



(49) 



'The reflection is from the constraints that form the convex domain X{y) x 3^. As the constraint domain is to restrict 
the iteration trajectory, it always helps the convergence. 
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2) Compensation for the Long-term Iteration yn-' From the Lagrange duality theory, the optimaUty 



condition for the outer problem ( lOl is given by 

T{6y,Xy;h) = =0 (50) 

)}i=i 

where Xyj > and qj{9y;.) < 0, for all I < j < J. Similarly, T(y; .) = has a unique solution 
y* = (61*, a;) for a; > 0, and the dynamics of y*{t) should satisfy Ty{y*;.)^ + Th'iy*; 0^ = 0- 
Also, using implicit function theorem, an ideal compensation estimator can be given by —Ty'^Th' (y; ■)■ 

Note that closed form expression is usually not available for T{y; .), due to the expectation 

VyF{e^, By; h% - ^yJ^QjiOy; .) 



EF(.). Alternatively, define T{Oy, Xy; h' , 



{Xyjqj{ey;.)y 



. Then 



T{y;.). Similarly, fhiiy,' 



for given /i', 7^(y; .) is an unbiased estimator of 7^(y; •), since ]ET(y; . 
is a unbiased estimator of Thi {y, ■)■ As a result, the compensation estimator for the long-term iteration 
can be given by 



^'^{y;.) = -Ty-^Th'{y;.) 



(51) 



C. Performance of the Compensation Algorithm 



Although the compensation estimators derived in ( 49 1 and ( 5 1 1 are from approximation, we can 
show that under some technical conditions, the tracking errors and y^ from continuous-time 
trajectories Xc{t) and yc{t) can be significantly eliminated. 

Theorem 6 (Tracking performance of the compensation algorithm): Assume CSI timescale sepa- 
ration for h\t) and h'^{t), i.e., e ^ an- Suppose that (^^(•), 'f^i-) and 'py{y; .) take the forms in (49l 

; cons 

< Lii\\x — x\\ and E||(^(*(y; .) 



and \5\\, and they are Lipschitz continuous, i.e., there exist positive constants L^, L%, Ly, I3y < oo, 



\0y{x;. 



^y{x; 



such that \\0'^{x;.) — (p^{x; .)|| < L^\\x — x 
^y{-)\\ < L^Wv -y*\\+ Py, for all x G X{y), y£y. Then, if 

(i) ay 7 — erLy > 0, and 



(ii) Ox 



2-kNsJ 



~ L%Ng ly- 



Oy'y — erL'^ 



> 0, 



the tracking error converges to in probability, and the the tracking error for y^ is upper bounded 

by nmw < 



Proof: Please refer to Appendix [G] for the proof. ■ 
In the case when is static, the convergence result is given in the following corollary. 
Corollary 3 (Tracking performance under static h}): Suppose and are given by (49 1 

and Lipschitz continuous as specified in Theorem |6j Then if ax — ^/^^ ^ I^x ~ L%N~^l 



l,y 



ayj-erL^ 



V^] > 0, the tracking error x^ converges to in probability. 
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Corollary [3] is obtained by setting the timescale parameter e = in Theorem [6] It corresponds to 



the case 1 scenario studied in Section III-A However, the performance with compensation is stronger 
because it does not require the short-term CSI timescale to be extremely slower than the algorithm 
timescale (i.e., qht <^ 7) in Theorem [2] and Corollary [T] for the convergence. 

Remark 4 (Interpretation of the results): Theorem [6] and Corollary [3] show the performance ad- 
vantage of the compensation algorithm under time-varying CSI. Specifically, we have the following 
observations. 

• Compensation for the long-term control: When the bias [3y of the compensation estimator goes 
to 0, the tracking error y'^ = yc — y* of the long-term variable y converges to as well. Note 



that the bias comes from using VyF{.) to estimate VyEF(.) in (50 1. To reduce the bias, we 
can use a Monte-Carlo method to estimate Vj^EF(.) = X]m=i yo)> by observing 

many realizations of V yF{x{t) , y{to); .) in the inner timescale. 

Compensation for the short-term control: Theorem [6] implies that when the short timescale CSI 
h^{t) does not change too fast (moderate an), the compensation algorithm can keep track with the 
stationary point target. This is a much weaker condition for the conventional convergence result, 
which requires qht ^ 7, i.e., the algorithm must iterate much faster than the CSI dynamics. 
Moreover, one can reduce the frame duration r, increase the number of slots Ns per frame, or 
increase the step size 7 to satisfy condition (ii) for enhancing the tracking of the inner iteration 
Xn^ , at the cost of larger signaling overhead, higher computational complexity and larger steady 



state error 0(7) as discussed in Section IV-B 



Note that, even though there can be zero convergence errors for Xc and yc in (43 1 and (44), the 



discrete-time iterations (47) and (48) still have 0(7) steady state error due to the constant step 
size 7 used for the tracking. Nevertheless, Theorem [6] and Corollary [3] indicate that the proposed 
compensation algorithm has an eminent convergence capability under time- varying h'^{t) and 

VI. An Application Example: Resource Allocations in Wireless Multi-hop Relay 

Network 

The mixed timescale optimization approach has vast applications in wireless communication net- 
works. In the following, we consider a particular example of joint flow control and power allocation in 
wireless relay network described in Section [lPB From this example, we demonstrate the compensation 



algorithm and apply the theoretical results for the convergence analysis. 
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A. The Two-Timescale Algorithm 

We apply the stochastic primal-dual method in ( [26| ) to derive the iterative algorithm Xn, and ynj 
in this example. Denote x = (p,A), where A = (Ai,...,Avk) is the Lagrange multiplier. For the 



example problem in (13 1, we can form the Lagrange function as 

J 

L(p, A,r;h) = ^log(rj) - V^pk-^\j 



Cfc(r)-log 1+ J] 



keSj 



keSj 



(52) 



je£ kec j=i 

where W is the total number of constraints, and Sj C C'^{m) for some m £ TZU {0}. 

1 ) Iteration for the short-term variable: The optimality condition (KKT condition | |4T| ) for the 
inner problem is given by 

|;^(P,A,r) 



a(p, A;r,h) 



A. 



0. 



9 M 1 ^ 

Efc65, Cfc(r) - log [I + Ekes, \hkrPk)\ \ .^^ 
Following the adaptive compensation algorithm in Section [Vj the iteration of the short-term variable 
is given by 

(53) 



p(n, + 1) = 
A(n, + l)=Vx 



pins) + ^Q^^ {p{ns), A(n^); r(n/)) + ^'p(.) 
d 

Hris) - -f^L (p(n,), A(n,); r(n/)) + ^!x{.) 



(54) 



where the projection Vp (■) and V\ (.) are to restrict the elements to be non-negative. The term 
(t^L(.), t^L(.)) corresponds to the iteration mapping G{.) in (21 1 (and (47 1). The compensations 
^p(.) and 'I'a(.) can be derived as 



^'a(.) 



^(p!A)^h(p(ns),.)Ah(n,) - g-i^^g,((p(n,),.)Ar(n,) 



(p,A)- 



where Ah(n,) = h(L^Jr) - h(L^Jr) and Ar(n,) = r(L^J) - r(L^J). 

2) Iteration for the long-term variable: We first derive an augmented Lagrange function Li{.) by 



substituting the equality constraints (20 1 into the Lagrangian L{.) in (52). The optimality condition 
for the outer problem is given by 

r(r; h') = |-ELi(p(n,), A(n,), r(nj)) = 0. 
or 



The update of the long-term variable is given by 



r(n/ + 1) = 



d 

r(n/) + (p(ns), A(ns), r(n/)) + 



(55) 



where the projection Vr (■) is to restrict r to be non-negative. The iteration (55 1 corresponds to the 
long-term variable update for in (22), and the term (p(ns); A(ns), r(nj)) corresponds to 
the stochastic estimator K{.) in (22). The compensation ^r(0 can be derived as 
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Components in the example 


Corresponding components in the model 




Primal-dual inner iteration 


53 


-i54| 




Short-term iterative sequence 1 


21 




r(n/) 


Primal outer iteration i 


55 






Long-term iterative sequence i 


22 




(*p(-),*a(.)) 


Compensation for the iimer iteration 


-ip^dh" - tpldy 


Compensation for the inner iteration 




Compensation for the outer iteration 


0'^dh' 


Compensation for the outer iteration 


R^' X Rf 


Projection domain for the inner iteration 


X{y) 


Projection domain for x^^ 




Projection domain for the outer iteration 


y 


Projection domain for 



Table II 

Algorithm associations between the example and the mixed timescale system model. 



where T(r;h') = ^Li(p, A, r; h') and the path loss variable h'(t) can be measured by averaging 
the CSI h(t) over a certain time window. 

Table [n] summaries the algorithm association between the example and the mixed timescale model. 

B. Implementation Considerations 

With the iteration timescale decomposition, we can consider two implementation scenarios of the 
two-timescale algorithm: a) distributive implementation, and b) hybrid implementation. 

Under distributive implementation, at each frame, each BS and RS node m G 7^U{0} acquires the 
local CSI {hj}j^c+{m) and exchange the local control variables {pj}jec+{m)^ {■^1™''} ^^d {rj}j^c+{m) 
with neighbor nodes. It then updates the long-term flow control {rj}j^c+{m) once according to the 



outer iteration (55), and updates the power control variables {Pj} j<^c+(m) ^^d {A.™^} in each time 



slot according to the inner iterations (53 1 and (54i. As an illustrative example. Fig. |6] a) demonstrates 



the message passing under distributive implementation and the network topology in Fig. [T] 

Under hybrid implementation, there is a RRM server coordinating the message passing and the 
outer loop iterations in the network as illustrated in Fig. [1]. At the beginning of each frame, each BS 
and RS node m ^ TZU {0} obtains long-term flow control {?'j}je£+{m) from the RRM server and 
acquires the local CSI {hj}j(zc+{m)- It then updates the local power control variables {Pj}jeC+{Tn) 



and {A^™^} according to the inner iterations (53 1 and (54i in each time slot within the frame. At the 
end of the frame, it passes the local variables {pj} and {X'f"^} together with the local CSI {hj} for 
j G C^{m) to the RRM server. By collecting the short term variables p and A as well as the global 



CSI h, the RRM server updates the long-term flow control r using the outer iteration ( 55 1 and feeds 
back to the BS and RSs at the beginning of the next frame. Fig. [6]b) illustrates the message passing 
under hybrid implementation and the network topology in Fig. [T] 



Note that as the inner iterations ( 53 1 and (p4| require only local CSI, they can be iterated for a 



finite number of steps A'^g > 1 at each frame to catch up with the fast timescale CSI variations. On 
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inner iteration;' (/?,,/7f,,A"")'i {iPi^P^,^'''^) 
in slot scale (p,^Pf,^^^ ) 



BS 

(Node 0) 



outer iteration 
in frame scale 



RS 1 

(Node 6) 



RS2 

(Nodes) 



(a) Distributive implementation 



RRM Server 



Outer iiteration 
in frame; scale (56) 





___ _ 














(^,^) 






Inner: iteration 



BS 

(NodeO) 



RS 1 

{Node 6) 



RS2 

(Node 5) 



in slt)t scale 
(54) -(55) 



(b) Hybrid implementation 
Figure 6. Example algorithm implementations and message passing for the network topology in Fig. [T| 



the other hand, since the outer iteration (55 1 requires global coordination which involves signaling 
latency, it can only be updated once at each frame. However, as the long term flow control r adapts 
to CSI statistics (i.e., the long term CSI h'), it does not require a fast iteration and is not sensitive 
to signaling latency. 

Considering the computational complexity, the two-timescale algorithm with variable partitions 
reduces the computational cost at the central controller by distributing the computation of the cross- 



layer network utility optimization to different nodes locally. Table III gives a comparison on the 
computational complexity in terms of CPU time over one frame under the example in Section II-B 
The inner iteration is assumed to update for Ng = 30 steps in each frame under two-timescale 
algorithms. The one-timescale centralized algorithm consumes more CPU time, which may not be a 
good choice, since the transmission power control is delay- sensitive. 

C. The Convergence Analysis 
In this subsection, we derive the upper bound of the tracking error of the two-timescale iterations 



(53l-(55l using the theoretical results developed in Section III We derive the convergence rate 
parameters ax and Uy as follows. 

Theorem 7 (Local convergence speed): Denote 



Ml{.) 



9p9p ^'1 dpdX 
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RRM Server (ms) 


Each BS (RS) (ms) 


Two-timescale distributive algorithm 




0.601 


Two-timescale hybrid algorithm 


0.0511 


0.533 


One-timescale centralized algorithm 


1.26 





Table III 

Computational complexity in terms of CPU time of the one-timescale centralized algorithm and 

TWO-TIMESCALE ALGORITHMS OVER ONE FRAME. THE SIMULATION WAS DONE ON A MATLAB PLATFORM RUNNING 
ON A DESKTOP COMPUTER WITH A 2.8 GHZ SINGLE CORE CPU. 



Then Ml{.) is negative definite for all (p*, r*) and A*(p*), under all h. In addition, given any optimal 
points w = (p*,r*), we have the local convergence rate ax{y^) > — Amax(5(-^^L(w) + M'[{-w))), 
'^yi'^) = ~^max (^afip^(w)^, where Amax(^) denotes the maximum eigenvalue of matrix A. ■ 
Proof: Please refer to Appendix [H] for the proof. ■ 

Using Theorem [v] a lower bound of the global convergence rate can be obtained ax = 
mf{ax{uj) : a; e X{y) x 3^, Vy G 37, Vh} and ay = inf{a^(cj) : u € X{y) x 3^,V?/ G 3^, Vh}. 

Given the results in Theorem [7j the the condition for algorithm stability and tracking error bound 
then directly follows from the results in Theorem |4] and |5j respectively. 

Note that we can always enhance the convergence and increase Ox and ay by introducing a carefully 
chosen positive definite scaling matrix T in the iterations. However, the computation of the scaUng 
matrix may increase the complexity for the inner iteration and require additional signaling overhead 
for the outer iteration. 

VII. Numerical Results 
In this section, we simulate the tracking performance of the mixed timescale algorithm for the 



example cross-layer stochastic optimization problem studied in Section II-B and Section VI We 



demonstrate the performance advantage of the mixed timescale algorithm over one-timescale algo- 



rithms under the CSI model discussed in Section II-A2 In addition, we show that the proposed 
two-timescale compensation algorithm in Section [V] significantly reduces the tracking error under 
time-varying CSI. 



We consider the wireless heterogeneous relay network described in Section II-B| Specifically, the 



network has 1 macro BS, 2 RSs and 4 mobile users who want to transmit data flows to the macro BS. 

^In fact, the domain X{y) x 3^ may not be compact, and (or Oy) may then be degenerated. However, in practice, 
the control variables x and y (corresponding to power and flow data rate here) cannot go unbounded. Therefore, one can 
identify a confident domain X{y) x y (Z X{y) x y and H (Z H, which are compact, to estimate the lower bound of the 
convergence rate (or ay). 
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The BSs are static and the mobiles are moving around with a speed at most Wmax = 100 km/h. The 



mobility is according to the Levy walk mobility model in Section |II-A1 with parameter Dmin = 75 



m, i = 1.8 and cq = i^min- There are 6 wireless links as illustrated in Fig. [T| and it is assumed 
that the network topology does not change during the simulation. Correspondingly, the long-term 
CSI timescale parameter is e = 6 x 10^^ sec^^. The control objective is to determine the flow rate 



congestion control r and power allocation p according to the proportional fair utiUty in (13 1. The 



frame duration is r = 1 ms, and the inner iterations (53 1 and ( [54| ) are updated for Ng = 30 steps in 
each frame. 

We consider the following baseline schemes: 

• Baseline 1 - One-timescale centralized algorithm based on real-time global CSI ||T|, ||T2|: 
The central controller (RRM server in Fig. [T]) solves the deterministic version (dropping the 



expectation) of the problem (13 1 at each time slot. The controller collects real-time global CSI 
(GCSI) h(t) at each time slot and computes the optimal flow rate congestion control r(h(t)) 
and power allocation p(h(t)) that adapt to each realization of h(t). 

Baseline 2 - One-timescale centralized algorithm based on statistical CSI |[T5j: For every 
Tg = 100 ms, the central controller solves a relaxed version of the stochastic optimization 



problem (13 1, where the link capacity constraint (14i is replaced by the probability outage 
constraint Pr [( [141 ) is not satisfied] < Gout> and the flow rate congestion control r and power 
allocation p adapt to the statistics of the GCSI h(t). 
• Baseline 3 - Two-timescale stochastic gradient without compensations: The algorithm itera- 
tions are based on stochastic gradient in ( [23] ) and ( [24| ) in solving Problem 4. 
Note that the baseline 1 suffers from huge computational complexity, as it searches for the optimal 
solution at each time slot, which is not scalable to large networks. Moreover, baseline 1 is very 
sensitive to signaling latency for the message passing throughout the networl?!^ On the other hand, 
baseline 2 is not sensitive to the signaling latency but it is too conservative as it does not exploit 
the local real-time CSI knowledge at the BS and the RSs. Hence, baseline 1 and baseline 2 are for 
performance benchmark only. 

A. Performance of the Mixed Timescale Algorithms 

Due to the exogenous stochastic variation of h'^{t) and h\t), the instantaneous link capacity 
constraint in ( 14 1 and ( [T5[ ) may not be satisfied for every iteration outputs. To quantify the associated 



^In the current practical communication networks, sucii as LTE, tiie backhaul latency is typically around 10-20 ms 
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Figure 7. The constraint outage probability under different CSI fading parameters an and e = 6 x 10 The signaling 
latency is r = 5 ms. 



performance penalty, we define the constraint outage probability as follows 

= ]r EE 1 {"^- ^ C(p(n),h(n)),Vm : j € £+(m)} 

^ n=l je£ 

where Nt is the total number of transmission frames, !{.} is the indicator function, and Cm^{p,h.) 
is the multi-access channel (MAC) capacity region at receiver node m, and is specified by ( [T2] ). 

Note that, an = 50 corresponds to around 10 ms channel coherence time | [3T| and an = 1 yields 
over 200 ms channel coherence time. Fig. |7] shows the constraint outage probability under different 
CSI fading parameters an and e = 6 x 10~^. The constraint outage probability increases when 
the channel is changing faster, but the proposed two-timescale compensation algorithm has the least 
constraint outage probability compared with other baselines under 5 ms signaling latency and various 
channel fading rates. 

Fig. [8]a) gives the throughput performance assuming no signaling latency. Baseline 1 yields the 
best performance, but it is highly sensitive to signaling latency, as shown in Fig. [8]b), where 5 ms 
signaling latency is considered. In Fig. [8]b), as the CSI varies faster, the throughput performance of 
all the schemes decrease, except for baseline 2. However, baseline 2 does not exploit the short-term 
transmission opportunity and achieves only moderate performance. As a comparison, the proposed 
two timescale compensation algorithm has the best performance and is robust to signaling latency. Fig. 
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(a) Assuming no signaling latency (h) t — 5 ms signaling latency 

Figure 8. Throughput performance of the different schemes under 6 = 6x10"* and average SNR 11 dB. Note that, 
uh = 50 corresponds to around 10 ms channel coherence time [31] and an ~ 1 yields over 200 ms channel coherence 
time. 




Figure 9. Proportional fair utility of the different schemes under 11 dB average SNR, 5 ms signaling latency and the 
long timescale CSI parameter e = 6 x 10~*. The proposed algorithm performs much better than all the baseline schemes. 



[9] demonstrates the corresponding proportional fair utility for the different schemes under signaUng 
latency of 5 ms. The proposed algorithm performs much better than all the other schemes. 
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Figure 10. A snapshot of algorithm trajectories of the proposed compensation algorithm and the stochastic gradient 
algorithm without compensations under short timescale CSI fading parameter an ~ 10 and long timescale parameter 
e = 6 X 10~^. The trajectories represent the online power allocation policy ps. The proposed compensation algorithm 
quickly converges to the optimal trajectory of the inner iteration, while the baseline algorithm fails to track the optimal 
target and yields much larger tracking errors. 



B. Tracking Performance of the Adaptive Compensation Algorithm 

We evaluate the tracking performance of the two-timescale compensation algorithm over the base- 
line stochastic gradient algorithm. 



Fig. 10 shows a snapshot of the algorithm trajectories of the proposed two-timescale compensation 
algorithm and the baseline stochastic gradient tracking algorithm without compensations, under short 
timescale CSI fading rate an = 10 and long timescale parameter e = 6 x 10~^. The trajectories 
represent the online power allocation policy p^. The proposed compensation algorithm quickly con- 
verges to the optimal trajectory of the inner iteration, while the baseline algorithm fails to track the 
optimal target and yields much larger tracking errors. 

VIII. Conclusions 

In this paper, we have analyzed the convergence behavior of a mixed timescale cross-layer stochastic 
optimization driven by multi-timescale CSI. The CSI dynamic is modeled by an auto-regressive 
process in the short timescale (small-scale fading), and a mobility-driven dynamic process in the long 
timescale (large-scale fading). We partitioned the control variables into short-term control variables 
and long-term control variables, and studied the convergence of the corresponding mixed timescale 
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Stochastic iterative algorithm. We derived a VSDS and showed that studying the algorithm conver- 
gence is equivalent to studying the stochastic stability of the VSDS. Using Lyapunov stochastic 
stability analysis, we derived a sufficient condition for the algorithm to be stable. In addition, we 
derived a tracking error upper bound in terms of the parameters of the mixed timescale CSI process. 
Based on these results, we proposed an adaptive compensation algorithm for enhancing the tracking 
performance. The analysis framework and the proposed algorithms were applied to an application 
example in a wireless heterogeneous network. Numerical results matched with the theoretical insights 
and demonstrated significant performance gain of the proposed compensation algorithms over the 
baselines. 

Appendix A 

Derivations of the Reflection Terms and Zy 



Taking a small step At, the ODE dynamics (30i-(31 1 can be written as 



yc{t + At) = y^{t) + k[y^{t),h}{t))At + Zy{t)At 
= Vy[yc{t) + k{yc{t),h\t))/\t . 

Consider that the convex domains X{y) and 3^ can be specified by a set of constraints uji{x, y; h) < 0, 
i = 1,...,W, and qi{y]h^) < 0, i = I,..., J, respectively. Then the Euclidean projections are 
equivalent to find the points xo(At) and yQ{At), which solve the following minimization problems 

mill Ux-{xcit) + G{xcit),yc{t),.)At)\\l (56) 

X 

subject to ooi{x, y;.) <0, \/i = 1, . . . ,W. 

and 

min l\\y-{y^{t) + k{y,{t),.)At)\\l (57) 

V 

subject to qi{y; .) < 0, Vi = 1, . . . , J. 
The corresponding Lagrange functions are given by 

l(-)(x, y, A,) = 2 11^ - [xc{t) + At] g + ^ A,, 

i=l 

and 

1 

L^'Hy, A,) = ^\\y- [ydt) + k{.)At] \\l + J2 KMv^ ■)• 



w 

^i{x,y] 



i=l 
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The KKT condition |32|, |33| for the problem (56 1 on the x variable is given by 

w 

xo{At)-[x,{t) + G{x,{t),yc{t),.)At] + ^Xl^^V^Ui{xo{At),yc{t);.) = (58) 



i=l 



X*Mxo{At),yc{ty,.) = 0, Vi. (59) 



Solving (58l-(59l, we obtain xo{At). Similarly, by writing the KKT condition for (57 1, we can obtain 
2/0 (At). Then the reflection terms are given by 

xo{At) - [xc{t)+G{xc{t),yc{t),.)At] 



Zx{t) 



lim 



At 



and 



Zy{t) 



lim 

At^O 



yo(At)-[j/,(t) + fc(j/e(t),.)At] 
At 



Appendix B 
Sketch Proof of Theorem|2] 

We ignore the transient states for and y^, and just focus on their stationary states. 



For the inner iteration Xn^ in (21 1, consider a large enough n^. Since is decreasing, we 
have ^ 7. From the timescale condition a/^r ^ 7 and the step size condition ^ 7, 
the iteration (21 1 finds the partial optimum £(y,/i(njr)) for each h{t) = (/ii(t), . . . ,hN{t)), where 
hj{t) = h''jhj{t). This can be shown under Assumption pi and H given a sufficiently small step size 7 



1 32 1, |33|. Note that the partial stationary point x(y,/i(njr)) of ( [2T[ ) is also the partial equilibrium 
point Xc(t) of (30l. Therefore, we have established that Xn^(t) ~^ Xc{t), where ns{t) = [tA^^/rJ. 
For the outer iteration in (|22|), with sufficient large nj, we have 



E 



( j;„^ _ 1 , „ 1 ; /i" (n/r ) , /i' I /i' 



TyVyF{x{ynf~i,.),ynf~i; h\nfT), h^)\h^ 

TyVyVl{y,h\h^)\h^ 



VyVi{y,io,h^)dFh4io) 

TyVyj Vl{y,iO,h^)dFhs{0j) 

Vi{y,h\h')\h'^ 



TyVy-K 



Kiy,h' 



(60) 



where Fh={uj) is the cumulative distribution function of the conditional probability Fr{h^ = tj|/i') = 
Pr(/i* = io), and the interchange of the integration and the differentiation is because the integration 
is bounded (i.e., the expectation of F{.) is bounded). We take conditional expectation here because 
y adapts to each realization of hK Therefore, we have the gradient estimator K{.) = K{y, /i') + ^(.), 
where ^(.) is some "noise" and IEC(.) = 0. Using the stochastic approximation | [l9} , ||36|, yn converges 
to y* almost surely under the assumed step size rule for 
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From Assumption |4j the matrix Vyk{y, /i') = Vy'K[K{.)] = E [VyEr(.)] is a negative definite ma- 
trix. Therefore, the system ijc = yyk{yc, h}) is asymptotically stable |j39| at a unique stationary point 
y*{h}), where y*{h'-) = Vyk{yc, h^) = 0. According to the definition of k{y, /i'), we have K{y^ h^) = 



k{y, h). This implies that the stationary point y„ of (22i is just the equilibrium y* of (31 1, i.e., y* = y* 



Then we have established the asymptotic result for limsup(_j.oo Pr {||y„^(f) — 2/c(i)|| > f?} = 0, where 

nf{t) = [t/T\. 

Appendix C 
Proof of Lemma[T] 

The above results are obtained from the implicit function theorem. From the optimality condition 
G{xc, yc, h^, h^) = and the implicit function theorem, we have 

Gh^{xc{yc,-),yc,h\h^)dh' 



dxciyc,-) = {xc{yc,-),yc,h'',h) 

+Ghi {xciVc, •),yc, /l^ h'')dh^ + Gy{xc{yc, ■),y, h', h})dy . 

But since dh} <^ dh^ due to the small variation of h^t) (controlled by e ^ an), the term Ghi{-)dh^ 
is comparatively small. Ignoring this term, equation ([33]) yields. 



Appendix D 
Proof of Theorem[3] 

Theorem [3] can be obtained from the weak convergence results in |19|, |36|, |43|. In the following, 
we sketch briefly how we can apply those results. 

Recall that the algorithm is implemented on the timescale = njr, where nj is the frame index 
and T is the frame duration. To establish the VSDS, we define the virtual algorithm timescales as 
follows. 

Definition 11 (Virtual algorithm timescale): The, virtual algorithm, timescale on frame is SLma'p^mg 
from the frame index nj to a real number s„^. = njNs^. The virtual algorithm timescale on slot is 
a mapping from the slot index to a real number Jn, = '^sT- ■ 

Under the virtual algorithm timescale, the iteration indices are related as ns{s) = [s/^\ and 
nf{s) = [s/{Ns'y)\. The virtual algorithm times is just a scaled implementation time, and their 
relationship is given by tn^ = Nsup^ = Snj^. 

Accordingly, denote the virtual CSI state as h^{s). Since dtnf = j^dsnf, the timescale of the 
virtual CSI dynamics /i*(s) can be aligned with the virtual algorithm timescale Sn^ by adding a gain 
parameter to (jlj) as, 

dh' = -~~^'ds + ^[^dW.. (61) 
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/i'^(s) has the same shape as h'^{t), but with a different timescale. 



The same trick appUes to the long-term virtual CSI dynamics h'' (s), as dh'', = —j^colDj (s) '' ^vj {s)ds, 



Vj. In a vector form, we have 



dh^ = -^HL{t)ds (62) 



where Hi{t) is an x diagonal matrix, with the j-th diagonal element being coLDj{t) ' ^Vj{t). 
The virtual CSI timescale separation parameter becomes ?= j^^- 



We use a localization method | |43[ and consider the algorithm trajectories (pT])-(|22|) and ([30|)-([31]) 
start from (xo,yo) at time s = sq = 0, which lies in the neighborhood of {x*{h^ ,h}-),y*{h!-)). We 
have 



Xris — l 2;f;(Sna — l) 



+7 



where using Taylor expansion, 

term ([63]) = -yGx (^Xc{Sn,-l),yc{Snf),h''{Snf),h''{Snffj {Xn,-1 - Xc(Sn,-l)) 

-iGy [Xc(Sn,-l),yc{Snf),h''{Snf),h\snf)) {Vuf - VdSnf)) +0(7). 



(64) 



Here, it is reasonable to consider — Zxi^nJ = 0, since if the partial equilibrium Xc G X, we 
eventually have z^^n^ = Zx(snJ = 0. If Xc G dX, both trajectories eventually search along the 
boundary, and Zx^n^ = Zx(snJ for a large enough Ug. 



The term (64i is just a first order Taylor expansion of the continuous trajectory (|30|) at the point 



Xc{sn,-i)- By taking x2isr. 



^ {Xn^ Xc 



s„J) and yl{snj) = {vuf - ydsnf)), we have 



Gx [xc{sj),yc{sj),h''{sj),h (sj)) xjisj 



+Gy(xc{sj),yc{sj),h'{sj),h^{sj)) y^{sj) 



x2iO)+ / {Gxi.)x2 + Gy{.)y2)ds + oi^) 



+ 0(7) 



where j = [j/Ns\ frame index of the outer iteration when the inner iteration is at the j-th slot. 
Equivalently, 

dx2 = Gx{.)x2ds + Gy{.)y2ds + 0(7). (65) 



Similarly, we derive the dynamic y?(s) as follows. 
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Urif VciSuf) 
Vnj-l - ydSnf-l) 

+Ns^ [N^^K {Xnj,ynf-i;-) + Zy,nf - iV^^i^ (£c(Sn^-l) , ?/c(Sn^-l) ; •) - Zy{Snf)](66) 



+7 K [xciSnf~l),yciSnf-l); ■) + Zy{Snf) - A;(yc(Sn^-l ) , /l' (Sn^ ) ) - Zy{Snj) 



(67) 
(68) 



where n/ = NsUf is the slot index of the inner iteration when the outer iteration is at the nj-th 
frame . The 



term ( 66 1 



K {Xnj,ynf-i;-) - K {xc{Snj-l),ynf-i;-) 



+K [xc{Snf-l),ynf-i; •) - K [xc{Snf-l),yc{Sni-l); ■) 
= {Ns-i)N^'^Kx {xc{Snf-l),yc{Sns-l);-) {xwj-l - Xc(s„^-l) + Xc(Sn/-l) - Xc(s„^_l)) 
+ {Ns^)Ng ^Ky (x(s„^_i), yc(Sn/-l); ■) {ynj-l - ydSrif-l )) + o(7) 

= {Ns-f)N;:^K^ {xc{snf-i),yc{snf-iy,-) i'xj + xl'"') 

+{Nsj)N-'^Ky {xc{snf^i),yc{snf-i); ■) y2 + 0(7) 

by the first order Taylor expansion of function K{.) at {xc{snf-i),yc{snf-i))- Note that, since 

As = Snj - Snj-1= Ns-f, 



1 



7 



term ^ = N;^K^{xc, yc, .){x2 + xl'^)As + N;^Ky{xc, yc, -W^s + 0(7). 



(69) 



Consider y*{h^) G 3^ is in the interior of the domain. (The case when y*{h^) is on the boundary 
will be discussed later). Then it is reasonable to consider G y, with probability 1. We consider 



the following process for the term (67i, 

S'y{s) := 7 XI [^i^cisj),ycisj); .) - k{yc{sj); h^) 
i=i 

Choosing a sufficiently small 5 > 0, we have 

nf(s+5) ^ 

S^s + 5)- S^is) p. 7 [Kixdsj),y; ■) - Hy; h'] 

j=n/(s)+l 

where y = yc{s). The above is an approximation since we use ydsj) ~ yc{s) for nf{s) < j < 
nf{s + d). However, the approximation is asymptotically accurate for sufficiently small 6 and 7. The 
central limit theorem for the state dependent process suggests that ^ + S) — S'^{s)) weakly 
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converge to a normal random variable, where T,s,5 is the covariance matrix of S'^{s + 6) — S'^{s) 

1 



|36|. This implies that 



^/7 



{S^{s + 5)-S^{s)) 



s+S 



(70) 



where Wu is a standard Winner process and 

+00 

^iyciso);V) = 7 ^ cov K (^Xcisj),ycisj)]h{sj)^ ,K (^Xciso),yc{so)]h{so)^ \1? 

j=~oo 

is the covariance matrix of the estimator K{.) under the virtual long-term CSI state /i'. In addition, 
we have 



l\ A 



j=-oo 



K {xc{tj),yc{tj)] h{tj)) , K (xcito), ydto); h{to)) 



T 



to be the covariance matrix of the estimator K{.) under the CSI state h, since the virtual algorithm 
timescale s„ is -j^^ times denser than the implementation timescale t„. Therefore, from (69 1 and 



(70 1, we obtain 



dyl = N-'K,{xc, yc, .){x2 + xl'^)ds + N-^Ky{xc, yc, -Weds + ^^Hvc; h^)dWs + 0(7). 

Note that, in a finite horizon case for s G [0,Ts], by letting 7 — )• 0, one can drop the 0(7) term 
and obtain the convergence results {x2,yj) — ^ {xcVc), where (xc,yc) is the solution to (71 i-(72i. In 



addition, using the sophisticated techniques in |T9|, p6| , one can further prove the convergence in 
the infinite horizon case for s G [0, 00) and obtain the following, 

dxc = Gxixcyc.h" ,h!-)xcds + Gy{xc,yc,h\h})ycds 
dye = N^^K.j;{xc,yc,-){xc + xl)ds + N;^^Ky{xc,yc,-)ycds 



(71) 
(72) 



+^JTNs^T.h{y,^ h^)dWs + dZy. 
where S2 (y^, /i') = f^^{yc, h}) in the case for y*{h}) G y. 



Moreover, changing the MCTS (30l-(31 1 into to the virtual algorithm time s, we get 

dxc = G{xc, ych^ , h})ds + dZ^ 
dye = N-^k{yc,V)ds + dZy 



(73) 
(74) 



where dZ^ and dZy are reflection terms. Notice that dx^ = dxc — dxc and dy^ = dye — dy*. We 
obtain the following error dynamic system 



dxl 
dyl 



G(xe, yc, h\ h^)ds + dZ, + G-^Gh^ {.)dh' + G^^Gy{.)dye 



N-^k{yc, h^)ds + dZy - iph' {h})dh} 



(75) 
(76) 
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Consider the SDEs (71i-(72) and (75l-(76l, and the virtual CSI dynamics (61l-(62i. Rearranging 
the terms and changing the virtual time notation s to t, we obtain the VSDS in ( [37] ). This proves the 
claimed results. 

Remark 5 (The case y*{h}) on the boundary): When y*{h}') is on the boundary, we can follow the 
argument in | |43| to find out the behavior ofyc{s). Note that the corresponding effect is only on the 
diffusion term (y„ /J')(iVF„ where ^{vcM) = ^o{ycMyc,h^), and So(y) = diag ({cT°(y)},=^i), 
which is defined in the following. Consider the i-th component of y*{h}') is on the boundary. There 
are two cases. Case i), the i-th component of the drift k{y*; h^) is non-zero, which means there must 

(i) (i) (i) 

be a reflection force Zy ^ on yiif and yc to keep them from reaching out of the boundary. Then 
obviously, upon reaching y*^^\ yn] is not likely to be disturbed by the noise from the estimator K{.) 
[unless the noise is larger than the drift k^^\y*; h'-)], and the i-th component of the diffusion term 
T,2 (y^^ h'')dWs should be zero. Hence af = 0. Case ii), the i-th component of the drift k{y*]h}) is 



zero, which means the reflection force Zy depends on the disturbance noise. According to |43|, we 



can simply consider = 1, just as the case when y*{h^) is in the interior of y. ■ 

Appendix E 
Proof of TheoremH] 

A. The Lyapunov Drift of the VSDS 

It is equivalent to show the VSDS u{t) is stochastically stable. We first give the following lemma. 



Lemma 3 (Ito's lemma): Consider a stochastic process u{t) given by the following SDE, du = 
f{u)dt + g{u)dWt and a function V{u) G ]R_|_. The Lyapunov drift operator on V{.) can be written 
as£y=|^/(n)+trUnr& 



We first consider that the optimal solution y* is in the interior of y, which means the reflection 
term dZy = and the matrix Tp (defined in Appendix |d|) is an identity matrix. Then using Lemma 
|9) the Lyapunov drift of the stochastic process u{t) can be written as 



CV{u) 





T r 






Vc 















Vc 



Gxixc^yci') Gy{xciyci •) 
N-^Kx{xc,yc,-) N~^Ky{xc,yc,-) 

+yjN~'K,{x,, y,, .)xl + {x^f G{x,, .) + (x^f G-'Gh^ ('b^h^ 

+ ixtfG-'GyN-'kiy,,V) + (y^f N;'kiy,,V) 



Ns-f 
1 

+-tr 
2 



2iV,7 

Gx^Gh-)^ (G~^Gh-) + 



h^) /i^ + -tr(riV7iS(y,)) 



where Xc denotes the partial optimum Xc{yc, h^,h'';t 



ill) 
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B. The Upper Bound of the Drift 

We bound each term from the above equation in the following. 
First of all, from Assumption [4j we can show that 

GxixciVcih) Gy{xc,ych) 



Qc 



Xc Uc 



N-^Kx{xc,yc,h) N-^Kyixc,yc,h) 
< -N-^a{\\xcf + \\ycf). 













yc 



(78) 



Secondly, the mapping G{xc, ■) on the x part has the property 

{xtfG{x,,.) = {xlf [' Gx{xl + Cxl,.)xldC 

Jo 



Thirdly, the mean mapping k{yc, h}) on the y part satisfies 

{y^fk{y,,V) = (y^f lim E fey^/iO 

n— >oo 

-1 



(79) 



lim E 

n— >oo 



< lim E 

n—>-oo 



{y. 



■e\T 



Ky{yl+iyl,h')yld^ 



-UyWylW^di 



(80) 



In addition, since K{y* ,h^) = due to the property of the stationary point y* G y, we have 

\\k{y,,V)\\ = lim nK{yc,y)\\ 

= \\TnnK{yc,y)-K{y:,V)\\ 



< lim My\\yc - yl\ 

n— ^oo 



(81) 



where the inequality is from the Lipschitz property in Assumption |4] 



Finally, using the above result, we can find an upper bound for the Lyapunov drift in (77 1 as 



CV < -N-'a{\\x,f + \\ycf) + ^\\yc\\\\x^,\ 



_ 1 _ ~ 

2 A's7 



^yh ii~e 



ii~eii lo^r 

^WyJ ' 



h'f + Co{y*{h')) 



= m 



(82) 



(83) 



where Co{y*{h^)) = ^tT{T,{y*)) + Ij^vj^N + ^j^N is from the trace terms in (77i. The 



parameter N is the dimension of the parameters h'^ and /i'. : x ^ is the upper bound drift 
function and x = (ll^clli l|yc||, ll^cll' llycll' 11^*11) is the vector measuring the deviations of each virtual 
state from the origin. 
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Note that (p^x) is a quadratic function and we can write it as 



X^Ax + b^x + Co 



where 



A 







2JV, 








Vyly 


VhCLhT 

4Af,7 


Vyly 


ay 










1 OffT 

2 AT, 7 



(84) 



[0,0,0,f^,0]^ and Co 



According to Lemma [2j a sufficient condition for the VSDS u{t) to be mean square stable is 
that the function CV{u) < </>(x) can be further upper bounded by 4'{x) < ~f{x) + Cf, where 
fix) = cx^X' for some constant c > and Cj < oo. This is equivalent to verifying if the function 
4>{x) + fix) = ~ cl)x + ft^X + Co is bounded above. Therefore, we only need to check the 

positive definite property of the coefficient matrix A. To do this, we can calculate each of the leading 
principle minors of A, and make them positive. These calculations lead to the sufficient condition 
( |39l ) in Theorem ^ 

Remark 6 (The case on the boundary): When the optimal solution y* is on the boundary 

of the constraint domain, we may have non-zero dZy and a non-identity matrix So(yc)- We may 
have the following two modifications in the above flow, (i) The term [Kxi-) (5?c + ^c) + ^yi')yc\ 
in (77 1 and (78 1 now becomes [Kxi-) ixc + x'^c) ~^ ^yi')yc + dZy\. Since the term about the 
X variable Kxi-) {xc + x'^c) ^oes, not contribute to the reflection dZy, we only need to evaluate 
yj^ [Ky{.)yc + dZy\. Note that, when there is a non-zero reflection on the i-th component of y^, 



we must have y^f^ = for most of the time. Thus we still have y^ [Ky{.)yc + dZy] y^ < —ctyWyc 



as we did in ( [781 ). (ii) Consider the covariance matrix in ( [72| ). Its trace must be smaller than the case 
when y*ih^) is in the interior, since there are some zero diagonal elements in the matrix So(y*) [see 
Appendix [D|. Thus the constant Co defined above is still an upper bound. Combining the cases (i) 
and (ii), the optimal solution y* being on the boundary does not change the upper bound of the drift 



as in (82 1. 



Appendix F 
Proof of Theorem[5] 

Choose a function /(x) = cx^^X for some constant < c < 1. We have /(x) ^ cAmm(^)||xlP> 
where Amin(^) denotes the smallest eigenvalue of matrix A. 
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Under the sufficient condition in Theorem |4] and from (83 1, we have the stochastic Lyapunov drift 
be upper bounded by 

CV[u) < cPixiu)) = -x'Ax + h'x + Co < -fix) + Ci 



2 



Where C, = Co + 3^, and C, = A'^b = ) ^,^^^sL-a:.%y2kX-2il^il^i < 
Then, using Lemma [2| we have 



lim sup - / E||'u(r)|p(ir < — — — x lim sup - / [ Cq -\ I dr 

t^oo t Jq cAmin(^) i->oo i Jo \ 1 " C/ 



cAmin(^) V 1 - 

where Cq = [rS + '^N{1 + vj^)] , and S = limsupi_^^ \ tr (S(y*(/i'(r)))) dr is the time- 
averaged covariance matrix of the estimator K{.). Choosing c to minimize the above upper bound, 
we obtain c* = ^=^7=f^. With the observation that Cb(e^) can typically be very small, due to the small 
timescale separation parameter e ^ 1. Then we choose c = | for a reasonable tight upper bound of 
the tracking error. 

We now derive the term Amin(A), by using the eigenvalue lower bound Amin(^) > 2"/'^'^^i'||A||fr ^^"^ 



1 44 1, where || . ||f denotes the Frobenius norm and n is the dimension of A. We apply a trick to obtain 
a good eigenvalue bound by letting Aq = NsA. Then Amin(^) = iV^Uminl^o) > 2^/'4ao\\[ = 
N-^^, where p = |det (^o)l = ^ [sNsaa^Oy - auy^vj^ - 2ayll - 2allvl) = 0{N,a'^a^ay) 
and r] = 2'^I'^\\Aq\\f = O (^J N'^a'l + aA . Therefore, we have 



limsup^ rE||n(r)|pdr < —L^^(-^\rJ: + ^N{l + vj,)\+C,{e')) 

t^OO t Jo 7^I\s ^ V^JVs L 7 J / 

rS + C) 



7/ 
P 

where C = '^N{1 + vj^) + 4Cb{e) = '^N{1 + vj^) + ©(e^ro^V^) 



Since ||xc|p + + ||yc|P + ||2/clP < IklP- we have + Cy < ^ (^S + C). 

Appendix G 
Proof of Theorem[6] 

Using the compensated MCTS (|43])-(|44]) and the CSI dynamics ([l])-(|2]), the dynamic system ( [45] ) 



( [461 ) can be written as 



+ i^li.) - h% h')dh' + - 



"^"dm (85) 



N-'k{yc,V)dt + (v9^(7i') - ^'^iy„h',V)) dV (86) 
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Consider two Lyapunov functions Vi{xl) = | (x^)^x^ and ^2(2/^) = ^(2/0)^2/0 defined along tlie 
trajectory of the virtual system ([85])-([86]). 

Define ^^{yc,h',hi) ^ ||(/pj;(7i') - ip'^{y„h' ,hi)\\ - E\\^'^(hi) - ^^{y„h',h% Note that ^^{.) 
acts like a "noise" term that depends on h'^ and satisfies = 0. Using Lemma 3 the Lyapunov 



drift of V2(.) is given by 



dh' 

dt 



< 



y IIJre||2 



+ 



L 



Ns7 



a. 



< 



+ ~eph^\\yl\\+eey{.)\\y\ 



-c\m+eey{-)\K\\ + 



(87) 



mm ) i^max ) 



2-^min ^^max, Oly and e 



j—e according to the mobility model and CSI 



where e{D 

model in Section |II-A[ c > is an arbitrary constant, and the first inequality is from ( |80| ). 



Notice that E 
of the tracking error 



4(5„-?L^) 



. Using Lemma 



as 



< 



Ac{ay-eL^)' 



we obtain the upper bound 
(88) 



El 



(89) 



A tight upper bound can be given by minimizing ( |88| ) over c > 0. Then we obtain 

ay - ay-f - erL^ 

To study define = W\\ -^\h''\\, where E||/J'^|| = according to the CSI model in 

\\. Define = ||y^|| — E||y^||. Note that and only depend on and y^, respectively, and 

nh = nWc\\ = ^- 

We then derive the Lyapunov drift of Vi{.) as follows. 



{xiY G{.) + {xiY 



a-HT 



h^]+{xlf{^U-)-V>U-))N-'k{y,,h') 



+ i^%)-^U-))^viyc,-)dhi+tr 



Ns7 



T' 



< 



i^ll + Ch) + Ly\\xlfN-Hy{E\ 



Ns7 



e||2 



(90) 



< 



a. 



ay-i 



xe||2 



(91) 



(c.f. equations (79 1 and (81 1 for the first inequality), where in (90 1, the d/i' term is dropped, since 
dh'- is much smaller than the dh^. 
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Note that the last two terms in dOTl) have mean due to the "noise" terms and ^y. Therefore, 



according to Lemma 
to in probability. 



if a. 



2^Ar,7^^ ^^^^^ ''y ay-y-erL" 



N7r ' ^ 



> 0, then xl converges 



Appendix H 
Proof of Theorem|7] 

We have the iteration mappings = -§^L{.), -^L(.) ) and K{.) = §^L{.). In addition, 
we have G^{.) = Ml{.) = | ^^^^ ^^^^ 1 and Ky{.) = From the convex 



' dXd-p 



y\'l ~ drdr 

L{.) OjxJ 



assumption, Ky{.) is negative definite. It follows that ay{.) = — Amax (^gf^-^(") 

Note that, the Hessian of the Lagrange function QpQp L{.) is negative definite. Then according to 
|32 Proposition 4.4.2], the matrix Ml{.) is also negative definite. 

To derive the convergence for the x part, we consider a Lyapunov function V{xe) = ^xjxg, 
where From a modification of Ito's formula in Lemma js) it follows that CV = 
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